Hadoop Developer Resume Profile
SUMMARY:
- IT professional with over 8 years of experience in system analysis, design, and development, and extensive experience in IBM DataStage v8.7/8.1/8.0/7.x using components such as Administrator, Manager, Designer, and Director.
- Over seven years in data warehousing, data integration, and data migration using IBM WebSphere DataStage, Oracle, PL/SQL, DB2 UDB, SQL Server 2000/2005, SQL procedural language, and shell scripts.
- Around seven years of experience applying ETL methodologies across all phases of the data warehousing life cycle.
- Experience in data ingestion into HDFS using the Hadoop ecosystem tools Sqoop and Flume, and in data transformation/analysis using Pig and Hive.
- In-depth knowledge of data warehousing and business intelligence concepts, with emphasis on ETL and full life cycle development, including requirement analysis, design, development, testing, and implementation.
- Expertise in all phases of the system development life cycle (SDLC) using methodologies such as Agile and Waterfall.
- Good grasp of data warehousing fundamentals, proven ability to implement them, and conversant with ETL processes.
- Worked with SQL, SQL*Plus, Oracle PL/SQL, stored procedures, table partitions, triggers, SQL queries, and PL/SQL packages, and loaded data into data warehouses/data marts.
- Excellent experience with major RDBMSs, including Oracle 10g/9i/8.x, SQL Server 7.0/6.5, and DB2 8.1/9.0.
- Extensively used DataStage Designer to design and develop server and PX jobs that migrate data from transactional systems (Sybase, DB2 UDB) into the data warehouse.
- Extensively used DataStage Manager to export/import DataStage job components and to import plug-in table definitions from DB2 UDB, Oracle, and Sybase databases.
- Designed server jobs, job sequencers, batch jobs, and parallel jobs; handled multiple pieces of a project.
- Experience writing UNIX shell scripts for purposes such as file validation, ETL process automation, and job scheduling using crontab.
- Designed parallel jobs using stages such as Join, Merge, Lookup, Remove Duplicates, Filter, Dataset, Lookup File Set, Modify, Aggregator, CFF, Transformer, XML, and MQ plug-in stages.
- Good experience in extraction, transformation, and loading (ETL) processes using the DataStage ETL tool, Parallel Extender, MetaStage, QualityStage, and ProfileStage.
- Developed server jobs using stages such as Sequential File, ODBC, Hashed File, Aggregator, Transformer, Sort, Link Partitioner, and Link Collector.
- Experience integrating data sources such as Oracle, Teradata, DB2, SQL Server, MS Access, and flat files into the staging area; extensively worked with materialized views and TOAD.
- Proven track record of troubleshooting DataStage jobs and addressing production issues such as performance tuning and enhancements.
- Excellent knowledge of analyzing data dependencies using metadata stored in the repository and preparing batches for existing sessions to facilitate scheduling of multiple sessions.
- Excellent analytical, problem-solving and communication skills
TECHNICAL SKILLS:
- Data Warehousing: IBM DataStage 8.7/8.1/8.0/7.5.3/7.5.2/7.5.1 (Designer, Director, Manager, Administrator), Parallel Extender, Server Edition, MVS Edition, QualityStage, ETL, OLAP, OLTP, SQL*Plus, Business Glossary, FastTrack, Information Analyzer, Metadata Workbench
- Business Intelligence: OBIEE, Crystal Reports, Hyperion, Cognos 7.0
- Databases: Oracle 10g/9i/8i/8.0/7.0, DB2 UDB 7.2/8.1/9.0, Mainframe, Teradata V2R6/13, MS SQL Server 2005/2008
- Tools: SQL*Loader, data version control, Autosys, Excel, Control-M, TOAD, SQL Navigator 6.5
- Programming: SQL, PL/SQL, C, C++, VB, XML, Java, J2EE, DOS, COBOL, UNIX/Korn shell scripting, Perl scripting, Python
- Analysis/Design: Agile, Rational Unified Process (RUP), UML, Waterfall
- Others: HTML 4.0, MS Excel, MS Office
- Environment: Sun Solaris, IBM AIX 5.3/5.2/4.2, MS DOS 6.22, Windows 2000, Windows NT 4.0, Windows XP
PROFESSIONAL EXPERIENCE:
Confidential
Sr. ETL Developer/Hadoop Developer
Responsibilities:
- Created a process to pull data from existing applications and land it on Hadoop.
- Used Sqoop to pull data from source databases such as the Oracle RMS database and the DB2 eCom database (a minimal sketch follows this list).
- Created Hive tables on top of the data extracted from the source systems.
- Partitioned the Hive tables depending on the load type.
- Created Hive tables to present the current snapshot of the source data.
- Created DataStage jobs to load data from the eCom database into the ODS and on to the business intelligence layer.
- Developed a solution of generic DataStage jobs to load 300 source tables into the current ODS layer (Netezza).
- Created reusable DataStage components to pull data from different source systems into the ODS.
- Developed jobs to load 29 dimension tables and 10 fact tables related to eCom into the business intelligence layer.
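Referenced above: a minimal sketch of the Sqoop-to-Hive ingestion pattern, assuming a comma-delimited Sqoop import and an external, date-partitioned Hive staging table. The connection string, schema, table, column, and path names are illustrative placeholders, not details from the actual project.

```bash
#!/bin/bash
# Illustrative only: pull one source table from Oracle into HDFS with Sqoop,
# then expose it through a partitioned external Hive staging table.
LOAD_DT=$(date +%Y-%m-%d)

# Hypothetical connection details; real host, service, and credentials differ.
sqoop import \
  --connect "jdbc:oracle:thin:@//rms-db-host:1521/RMSPRD" \
  --username etl_user \
  --password-file /user/etl/.oracle_pwd \
  --table RMS.ITEM_MASTER \
  --target-dir /data/landing/rms/item_master/${LOAD_DT} \
  --num-mappers 4 || exit 1

# External table over the landed files, partitioned by load date.
hive -e "
CREATE EXTERNAL TABLE IF NOT EXISTS rms_stg.item_master (
  item_id   STRING,
  item_desc STRING,
  dept_no   INT
)
PARTITIONED BY (load_dt STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/landing/rms/item_master';

ALTER TABLE rms_stg.item_master
  ADD IF NOT EXISTS PARTITION (load_dt='${LOAD_DT}')
  LOCATION '/data/landing/rms/item_master/${LOAD_DT}';
"
```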
Environment: Hadoop, MapReduce, HDFS, Hive, Java (JDK 1.6), IBM BigInsights, IBM InfoSphere DataStage 8.7, UC4, shell scripts, Windows XP, UNIX, Netezza, Oracle, SQL Server 2008, PL/SQL.
Confidential
Hadoop Developer
Responsibilities:
- Created a process to pull data from existing applications and land it on Hadoop.
- Worked in an agile environment; involved in sprint planning, grooming, and daily stand-up meetings.
- Responsible for meeting with application owners to define and plan the Sqoop extraction of data from source systems.
- Used Sqoop to pull data from source databases such as Teradata, DB2, and MS SQL Server.
- Created Hive tables on top of the data extracted from the source systems.
- Created Hive and Pig UDFs in Java for data transformations and date conversions.
- Partitioned the Hive tables depending on the load type.
- Worked with Avro and sequence file formats.
- Created MapReduce programs for data transformations.
- Responsible for creating Pig scripts for data transformations.
- Responsible for creating Datameer links for data visualization.
- Assisted the business in validating and analyzing the data.
- Created shell wrapper scripts for the Sqoop, Hive, and MapReduce jobs (a minimal sketch follows this list).
- Deployed and scheduled the tested Sqoop, Hive and Datameer jobs using Autosys.
- Experienced in managing and reviewing Hadoop log files
- Created workflows using Oozie.
- Good understanding of Hadoop architecture and knowledge of the NoSQL databases Cassandra and HBase.
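Referenced above: a minimal sketch of the kind of shell wrapper placed around the Sqoop and Hive steps, with simple logging and fail-fast error handling. The connection string, database, table, and path names are assumptions for illustration only.

```bash
#!/bin/bash
# Illustrative wrapper: sqoop a source table into HDFS, then load it into a
# partitioned Hive staging table, logging each step and failing on errors.
set -euo pipefail

SRC_TABLE="CUSTOMER"                      # placeholder source table
HIVE_TABLE="stg.customer"                 # placeholder Hive target
LOAD_DT=$(date +%Y-%m-%d)
TARGET_DIR="/data/landing/db2/${SRC_TABLE}/${LOAD_DT}"
LOG_FILE="/var/log/ingest/${SRC_TABLE}_${LOAD_DT}.log"

log() { echo "$(date '+%Y-%m-%d %H:%M:%S') $*" >> "${LOG_FILE}"; }

log "Starting Sqoop import of ${SRC_TABLE}"
sqoop import \
  --connect "jdbc:db2://db2-host:50000/SRCDB" \
  --username etl_user \
  --password-file /user/etl/.db2_pwd \
  --table "${SRC_TABLE}" \
  --target-dir "${TARGET_DIR}" \
  --num-mappers 4 >> "${LOG_FILE}" 2>&1

log "Loading ${SRC_TABLE} into Hive partition load_dt=${LOAD_DT}"
hive -e "LOAD DATA INPATH '${TARGET_DIR}'
         INTO TABLE ${HIVE_TABLE}
         PARTITION (load_dt='${LOAD_DT}')" >> "${LOG_FILE}" 2>&1

log "Completed ${SRC_TABLE} load"
```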
Environment: Hadoop, MapReduce, HDFS, Hive, Java (JDK 1.6), Pig, Datameer, UNIX, shell scripting, Teradata, DB2, MySQL, Autosys, Oozie.
Confidential
Sr. DataStage Consultant / Hadoop Developer
Hadoop experience:
- Imported and exported data into HDFS and Hive using Sqoop.
- Experienced in managing and reviewing Hadoop log files
- Created Hive/Pig components for converting fixed-length ASCII files to Hive tables; loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Responsible for managing data coming from different sources.
- Supported MapReduce programs running on the cluster.
- Understanding of cluster coordination services through ZooKeeper.
- Involved in loading data from the UNIX file system to HDFS.
- Good understanding of Hive installation and configuration, as well as Hive UDFs.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Automated all the jobs that pull data from the FTP server and load it into Hive tables, using Oozie workflows (a minimal sketch of the load step follows this list).
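Referenced in the last item above: a minimal sketch of the file-system-to-HDFS-to-Hive load step that such workflows automate. Feed, path, and table names are placeholders, and CLI flags may vary slightly by Hadoop version.

```bash
#!/bin/bash
# Illustrative only: stage a local feed file into HDFS, then load it into a
# date-partitioned Hive staging table.
FEED="orders"                                  # placeholder feed name
LOAD_DT=$(date +%Y-%m-%d)
LOCAL_FILE="/ftp/inbound/${FEED}_${LOAD_DT}.dat"
HDFS_DIR="/data/inbound/${FEED}/${LOAD_DT}"

hadoop fs -mkdir -p "${HDFS_DIR}"
hadoop fs -put "${LOCAL_FILE}" "${HDFS_DIR}/" || exit 1

# LOAD DATA INPATH moves the HDFS file into the table's partition location.
hive -e "LOAD DATA INPATH '${HDFS_DIR}/$(basename ${LOCAL_FILE})'
         INTO TABLE stg.${FEED}
         PARTITION (load_dt='${LOAD_DT}')"
```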
ETL Responsibilities:
- Involved in requirement gathering, analysis and study of existing systems.
- Involved in preparing technical designs/specifications for data extraction, transformation, and loading.
- Led a team of four developers, participated in daily scrum meetings, and created ETL solutions for complex business requirements.
- Wrote stored procedures, functions, and packages to modify and load data and to create extracts.
- Wrote Teradata MultiLoad, FastLoad, FastExport, and BTEQ scripts for loading, modifying, and exporting data (a minimal BTEQ sketch follows this list).
- Extensively used DataStage Designer to develop jobs that extract, cleanse, transform, integrate, and create extract files as needed.
- Developed complex Teradata SQL involving many tables and calculating summary values as needed.
- Used Information Analyzer for column analysis and wrote data rules for quality checks.
- Used Business Glossary and FastTrack for ETL mapping and to link business terms with technical terms and solutions.
- Also used Metadata Workbench for impact analysis of the existing data model.
- Also worked as a part-time administrator, handling DataStage configuration, ODBC connection creation, assigning roles to users, monitoring the system, and killing processes when needed.
- Performed general cleanup and maintenance of the DataStage server.
- Developed a generic shell script that uses WMQFTE to initiate file transfers between two servers.
- Scheduled the jobs using the Autosys scheduler, which triggers the ETL jobs and invokes the WMQFTE shell scripts to initiate file transfers between the two servers.
- Involved in writing JIL scripts to create Autosys jobs that trigger ETL jobs and shell scripts.
- Created technical specification documents for the DataStage jobs, developed several test plans, and maintained error logs/audit trails.
- Implemented performance-tuning techniques across various stages of the ETL process.
- Followed up the deployment process of DataStage code migration across the development, test, and production environments with the admin team.
- Coordinated with client managers, business architects, and data architects for sign-offs on data models, ETL design documents, testing documents, migrations, and end-user review specs.
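Referenced above: a minimal sketch of a shell-wrapped BTEQ extract of the kind described in the Teradata scripting item, assuming a simple report-mode export with a return-code check. The TDP id, credentials, database, table, and column names are placeholders, not the project's actual values.

```bash
#!/bin/bash
# Illustrative only: run a BTEQ report export from Teradata and verify the
# utility's return code before continuing.
OUT_FILE="/data/extracts/sales_summary_$(date +%Y%m%d).txt"

bteq <<EOF
.LOGON tdprod/etl_user,etl_password
.EXPORT REPORT FILE = ${OUT_FILE}
SELECT region_cd, SUM(sales_amt) AS total_sales
FROM   edw.daily_sales
GROUP  BY region_cd
ORDER  BY region_cd;
.EXPORT RESET
.LOGOFF
.QUIT
EOF

RC=$?
if [ ${RC} -ne 0 ]; then
  echo "BTEQ extract failed with return code ${RC}" >&2
  exit ${RC}
fi
```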