ETL Developer Resume

Minneapolis, MN

SUMMARY

  • 6+ years of experience in the IT industry spanning development, maintenance and support, and migration projects in Big Data, ETL, Machine Learning, and Natural Language Processing technologies.
  • Excellent understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
  • Experience in migrating data between RDBMS and unstructured sources and HDFS using Sqoop and Flume.
  • Experience in developing and implementing MapReduce jobs in Python to process and analyze large datasets (see the sketch after this summary).
  • Experience in developing Pig Latin and HiveQL scripts for data analysis and ETL, and in extending default functionality by writing User Defined Functions (UDFs) for data-specific processing.
  • Experience with Informatica 8.6.x and above (Source Analyzer, Mapping Designer, Mapplet Designer, Transformation Developer, Warehouse Designer, Repository Manager, and Workflow Manager/Server Manager).
  • Extensively worked with non-relational data sources such as flat files and XML files, as well as relational sources such as Oracle, SQL Server, and DB2.
  • Extensively worked in Extraction, Transformation and Loading of data from multiple sources into Data Warehouse.
  • Experience in building models using various supervised and unsupervised learning algorithms.
  • Proficient in machine learning techniques for feature extraction, statistical and probabilistic modeling of data, and classifier techniques for pattern recognition problems.
  • Good knowledge in job scheduling and monitoring through Oozie, Autosys and ZooKeeper.
  • Experience with NoSQL databases such as HBase.
  • Good experience in Python, UNIX and shell scripts.
  • Good knowledge in tuning the performance of SQL queries and ETL process.
  • Experience with open-source and analytical tools such as NLTK, scikit-learn, R, and MATLAB.
  • Experienced in working with tools such as TOAD, SQL Server Management Studio, and SQL*Plus for development and customization.
  • Experienced in the post-development cycle and in supporting applications in production.
  • Excellent analytical and problem solving skills.
  • Build effective working relationships with client teams to understand support requirements and manage client expectations.
  • Excellent interpersonal and communication skills; technically competent and results-oriented, with strong problem-solving skills and the ability to work effectively both as a team member and independently.
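
A minimal sketch of the kind of Python MapReduce (Hadoop Streaming) job referred to above; the tab-delimited input format, field positions, and paths are illustrative assumptions, not taken from any particular project.

    # mapper.py - illustrative Hadoop Streaming mapper (assumes tab-delimited log records)
    import sys

    for line in sys.stdin:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue                    # skip malformed records
        print("%s\t1" % fields[2])      # emit the assumed event-type column with a count of 1

    # reducer.py - sums the counts per key emitted by the mapper (input sorted by key)
    import sys

    current_key, count = None, 0
    for line in sys.stdin:
        key, value = line.rstrip("\n").split("\t")
        if key == current_key:
            count += int(value)
        else:
            if current_key is not None:
                print("%s\t%d" % (current_key, count))
            current_key, count = key, int(value)
    if current_key is not None:
        print("%s\t%d" % (current_key, count))

    # Run with a standard Hadoop Streaming invocation, e.g.:
    # hadoop jar hadoop-streaming.jar -files mapper.py,reducer.py \
    #   -mapper mapper.py -reducer reducer.py -input /data/logs -output /data/counts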

TECHNICAL SKILLS

Technologies: Big data, Machine Learning, ETL, NLP, Pattern Recognition.

ETL Tools: Informatica Power Center 9.x/8.x, Informatica Power Exchange, Reporting Service, Metadata Manager.

Languages: Hive, MATLAB, Oracle PL/SQL, Perl, Python, Pig, R, SQL, UNIX Shell Script.

Databases: Oracle 11g/10g, DB2 8.0/7.0, SQL Server 2008/2012, HBase

Environment: Windows XP/2008/2003/2000/NT/98/95, UNIX, Linux.

Other Tools: Autosys, Sqoop, Flume, Oozie, ZooKeeper, IBM Attila, Weka, NLTK, scikit-learn, SRILM, Toad.

PROFESSIONAL EXPERIENCE

Confidential, Chicago, IL

Hadoop Developer

Responsibilities:

  • Built a framework for storing and processing input data from various sources.
  • Extensive experience with Hadoop and its components, including HDFS, MapReduce, Apache Pig, Hive, Sqoop, HBase, and Oozie.
  • Extensive experience in setting up Hadoop clusters.
  • Good working knowledge of MapReduce and Apache Pig; wrote Pig scripts to reduce job execution time.
  • Created Hive tables to store the processed results in a tabular format.
  • Developed Sqoop scripts to move data between the MySQL database and HDFS for downstream Pig processing (see the sketch below).
  • Wrote script files for processing data and loading it into HDFS using HDFS CLI commands.
  • Developed UNIX shell scripts to create reports from Hive data.
  • Analyzed requirements for setting up the cluster; moved all log/text files generated by various products into HDFS.

Environment: Red Hat Linux, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, ZooKeeper, Oozie, Python, HBase, MRUnit.
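
A minimal Python sketch of the Sqoop import and HDFS load steps referenced above; the JDBC connection string, credentials, table name, and paths are placeholders, not actual project values.

    #!/usr/bin/env python
    # Illustrative wrapper around the Sqoop and HDFS command-line tools.
    import subprocess

    def sqoop_import(table, target_dir):
        """Pull a MySQL table into HDFS so downstream Pig/Hive jobs can read it."""
        subprocess.check_call([
            "sqoop", "import",
            "--connect", "jdbc:mysql://dbhost/salesdb",   # hypothetical source database
            "--username", "etl_user", "--password", "changeit",
            "--table", table,
            "--target-dir", target_dir,
            "--num-mappers", "4",
        ])

    def load_to_hdfs(local_path, hdfs_dir):
        """Copy locally generated log/text files into HDFS using the hdfs CLI."""
        subprocess.check_call(["hdfs", "dfs", "-mkdir", "-p", hdfs_dir])
        subprocess.check_call(["hdfs", "dfs", "-put", local_path, hdfs_dir])

    if __name__ == "__main__":
        sqoop_import("orders", "/data/raw/orders")
        load_to_hdfs("/var/log/app/app.log", "/data/raw/logs")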

Confidential, New Jersey

Hadoop Developer

Responsibilities:

  • Developed framework-related components and Hive analytical queries to extract business-critical information per the business requirements.
  • Developed a custom record reader to handle specific input formats.
  • Developed Hive scripts to process the logs and identify critical user information, such as the number of shares purchased.
  • Created Hive queries to determine sell-to-cover information.
  • Scheduled the workflows using the Oozie workflow scheduler.
  • Used MapReduce to compute the frequency of stock options for each year.
  • Improved Hive query performance by implementing partitioning and clustering (see the sketch below).
  • Imported and exported data between web servers and HDFS/Hive.
  • Used Flume to collect and aggregate web log data from sources such as web servers and push it to HDFS.

Environment: Hadoop, CDH4, Hive, Sqoop, Oozie, Python, UNIX Shell Scripting, MapReduce, HBase, Pig, and Flume.
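
A sketch of the Hive partitioning and clustering (bucketing) approach mentioned above, driven from Python via the Hive CLI; the table names, columns, and bucket count are hypothetical.

    #!/usr/bin/env python
    # Illustrative: create a partitioned, bucketed table and load it from a staging table.
    import subprocess

    DDL = """
    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;
    SET hive.enforce.bucketing=true;

    CREATE TABLE IF NOT EXISTS trades_part (
      account_id STRING,
      symbol     STRING,
      shares     INT,
      price      DOUBLE
    )
    PARTITIONED BY (trade_year INT)
    CLUSTERED BY (account_id) INTO 32 BUCKETS;

    -- Load from a staging table, letting Hive create year partitions dynamically
    INSERT OVERWRITE TABLE trades_part PARTITION (trade_year)
    SELECT account_id, symbol, shares, price, year(trade_date) AS trade_year
    FROM trades_staging;
    """

    # Run the statements through the Hive CLI
    subprocess.check_call(["hive", "-e", DDL])

Queries filtered on trade_year then touch only the relevant partitions, which is the effect the partitioning work above was aiming for.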

Confidential - Minneapolis MN

ETL Developer

Responsibilities:

  • Prepared technical design/specifications for data Extraction, Transformation and Loading.
  • Worked with the Informatica utilities Source Analyzer, Warehouse Designer, Mapping Designer, Mapplet Designer, and Transformation Developer.
  • Analyzed sources, transformed data, mapped the data, and loaded it into targets using Informatica PowerCenter Designer.
  • Created reusable transformations to load data from operational data source to Data Warehouse and involved in capacity planning and storage of data.
  • Used Variables and Parameters in the mappings to pass the values between mappings and sessions.
  • Used data miner to process raw data from flat files.
  • Created Stored Procedures, Functions, Packages and Triggers using PL/SQL.
  • Implemented restart strategy and error handling techniques to recover failed sessions.
  • Built reports according to user requirements.
  • Used UNIX shell scripts to automate pre-session and post-session processes (see the sketch below).
  • Performed performance tuning to improve data extraction, processing, and load times.
  • Wrote complex SQL Queries involving multiple tables with joins.
  • Implemented best practices as per the standards while designing technical documents and developing Informatica ETL process.

Environment: Informatica PowerCenter 8.x, Oracle 10g, SQL Server 2008, Autosys, Toad 9.0.1, UNIX, SQL Developer, SQL.
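
The pre-/post-session automation above was implemented in UNIX shell scripts; below is a minimal Python sketch of the same idea using Informatica's pmcmd utility. The integration service, domain, folder, workflow name, credentials, and source-file path are placeholders.

    #!/usr/bin/env python
    # Illustrative pre-session check followed by a workflow kick-off via pmcmd.
    import os
    import subprocess
    import sys

    SOURCE_FILE = "/data/inbound/orders.dat"     # hypothetical pre-session dependency

    def pre_session_check():
        """Fail fast if the expected source file has not arrived or is empty."""
        if not os.path.isfile(SOURCE_FILE) or os.path.getsize(SOURCE_FILE) == 0:
            sys.exit("Pre-session check failed: %s missing or empty" % SOURCE_FILE)

    def start_workflow():
        """Start the load workflow and wait for it to finish."""
        subprocess.check_call([
            "pmcmd", "startworkflow",
            "-sv", "INT_SVC", "-d", "DOM_DEV",
            "-u", "etl_user", "-p", "secret",
            "-f", "SALES_DW", "-wait", "wf_load_sales",
        ])

    if __name__ == "__main__":
        pre_session_check()
        start_workflow()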

Confidential, Camp Hill, PA

Informatica Developer

Responsibilities:

  • Analyzed the source data and coordinated with the Data Warehouse team in developing the relational model.
  • Designed and developed logical and physical models to store data retrieved from other sources including legacy systems.
  • Extensively used Informatica PowerCenter 8.1.1 to extract data from various sources and load it into the staging database.
  • Interacted with business representatives for Need Analysis and to define Business and Functional Specifications. Participated in the Design team and user requirement gathering.
  • Performed source data analysis; the primary data sources were Oracle and SQL Server 2005 (see the profiling sketch below).
  • Extensively used ETL to load data from multiple sources to Staging area (Oracle 10g) using Informatica Power Center 8.1.1
  • Performed migration of mappings and workflows from Development to Test and to Production Servers.
  • Developed Informatica mappings and mapplets and tuned them for optimum performance, dependencies, and batch design.
  • Worked with pre- and post-session tasks and extracted data from the transaction system into the staging area; experienced in identifying fact and dimension tables.
  • Participated in all facets of the software development cycle including providing input on requirement specifications, high level design documents, and user’s guides.
  • Tuned sources, targets, mappings and sessions to improve the performance of data load.
  • Involved in Unit testing and documentation.

Environment: Informatica 8.1.1, Power Exchange, Oracle 10g, PL/SQL, Toad 9.4, SQL Server 2005, Windows NT, UNIX Shell Scripting.
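
A small sketch of the kind of source-data profiling done during source analysis, written in Python; the cx_Oracle driver, connection details, and table/column names are assumptions for the example, not project specifics.

    #!/usr/bin/env python
    # Illustrative profiling: row count and per-column null counts for a candidate source table.
    import cx_Oracle

    conn = cx_Oracle.connect("etl_user", "secret", "dbhost/ORCL")   # placeholder DSN
    cur = conn.cursor()

    def profile(table, columns):
        cur.execute("SELECT COUNT(*) FROM %s" % table)
        total = cur.fetchone()[0]
        print("%s: %d rows" % (table, total))
        for col in columns:
            cur.execute("SELECT COUNT(*) FROM %s WHERE %s IS NULL" % (table, col))
            nulls = cur.fetchone()[0]
            print("  %s: %d nulls (%.1f%%)" % (col, nulls, 100.0 * nulls / max(total, 1)))

    profile("CUSTOMERS", ["CUSTOMER_ID", "EMAIL", "CREATED_DT"])
    conn.close()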

Confidential

Software Engineer

Responsibilities:

  • Developed a speech recognition engine through acoustic and language model training and testing, using IBM Voice Tailor, shell, and Perl in a Linux environment.
  • Developed a voicemail-to-text engine for the Spanish and French languages.
  • Responsible for complete development from architecture to production.
  • Coordinating project schedules, architecture, design, build and release schedules to ensure timely delivery to internal and external customers.
  • Involved in coding, testing and implementing applications.
  • Punctuation generation: independently created a punctuation-generation model for both English and French voicemail messages (see the classifier sketch below).
  • The system was built after conducting experiments with different feature-selection mechanisms and machine learning algorithms.
  • Cluster-based language modeling: built language models using various clustering techniques to reduce perplexity on mixed-domain data.

Environment: Linux, IBM Attila, SRILM, Shell Scripting, Python, R, MATLAB.
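
A minimal sketch of punctuation prediction framed as token-level classification, in the spirit of the punctuation-generation work above; the toy data, feature set, and scikit-learn pipeline are invented for the example and are not the actual system.

    #!/usr/bin/env python
    # Illustrative: predict the punctuation mark that follows each token.
    from sklearn.feature_extraction import DictVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    def features(tokens, i):
        """Simple lexical features around position i (word identity and neighbors)."""
        return {
            "w0": tokens[i].lower(),
            "w-1": tokens[i - 1].lower() if i > 0 else "<s>",
            "w+1": tokens[i + 1].lower() if i + 1 < len(tokens) else "</s>",
            "is_last": i == len(tokens) - 1,
        }

    # Toy training data: each token labeled with the punctuation that follows it.
    sentences = [
        (["hello", "john", "please", "call", "me", "back"],
         ["COMMA", "NONE", "NONE", "NONE", "NONE", "PERIOD"]),
        (["thanks", "bye"], ["COMMA", "PERIOD"]),
    ]

    X = [features(toks, i) for toks, labels in sentences for i in range(len(toks))]
    y = [label for _, labels in sentences for label in labels]

    model = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=200))
    model.fit(X, y)

    test = ["please", "call", "me"]
    print(list(zip(test, model.predict([features(test, i) for i in range(len(test))]))))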
