
Hadoop Developer Resume


Charlotte, NC

SUMMARY:

  • IT professional with 8+ years of experience in software development, design, and validation, specializing in Mainframes, Java, databases, and Hadoop technologies; gained experience across industries including Engineering, Service Delivery, and Customer Relationship Management.
  • Strong experience in Big Data projects across multiple domains and tools, covering all phases of the SDLC: requirements gathering, system design, development, enhancement, maintenance, testing, deployment, and production support.
  • Strong experience in Big Data and Hadoop Ecosystem tools like MR, PIG, HIVE, SQOOP, OOZIE, FLUME, HBASE and SPARK.
  • Worked with different file formats such as JSON, XML, CSV, and XLS.
  • Used Amazon AWS EMR and EC2 for cloud-based big data processing.
  • Good understanding of HDFS Design, Daemons, Name node Federation and HDFS high availability (HA).
  • Good understanding of Spark Core, Spark SQL, Spark Streaming, and Kafka.
  • Knowledge of NoSQL databases such as HBase, and DynamoDB.
  • Experience in UNIX and shell scripting.
  • Built processes that load hundreds of terabytes of data into the corporate data warehouse to support data visualizations for the business analytics team.
  • Solid understanding of high-volume, high-performance systems.
  • Worked on Integration Manager of SQOOP import and export.
  • Hands - on experience in scheduling jobs on Autosys, Oozie, Ca7.
  • Very good understanding of performance tuning and query optimization techniques.
  • Excellent knowledge of YARN architecture.
  • Good Knowledge on Data warehousing concepts.
  • Experience in Agile Development environments
  • Hardworking professional with a strong ability to work well in a team environment. Exceptional time management skills with a strong work ethic.
  • Good knowledge of creating S3 buckets in AWS for storing input and output files (see the S3 sketch after this list).
  • Wrote Sqoop scripts to move data between relational databases and HDFS.
  • Possess superior design and debugging capabilities, innovative problem solving and excellent analytical Skills.
  • Focused on Quality and processes. Excellent written and verbal communication skills and team player.
  • Quick to adapt to new software applications and products; a self-starter with excellent communication skills and a good understanding of business workflows.
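
S3 sketch: a minimal example of the bucket usage mentioned above, assuming a Spark job (for example on EMR) with the s3a connector available; the bucket names and paths are hypothetical placeholders, not taken from the projects below.

    import org.apache.spark.sql.SparkSession

    object S3ReadWriteSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("S3ReadWriteSketch")
          .getOrCreate()

        // Read input files from an S3 bucket (CSV with a header row assumed).
        val input = spark.read
          .option("header", "true")
          .csv("s3a://example-input-bucket/incoming/")

        // ... transformations would go here ...

        // Write the results back to a separate S3 bucket as Parquet.
        input.write
          .mode("overwrite")
          .parquet("s3a://example-output-bucket/processed/")

        spark.stop()
      }
    }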

TECHNICAL SKILLS:

Big Data Ecosystem: Hadoop, MapReduce, YARN, Hive, HBase, Flume, Sqoop, Impala, Oozie, Zookeeper, Cloudera, Spark, Scala, Kafka, Hue, DMExpress-H

Programming Languages: C, C++, Data Structures, Java, SQL, Pig Latin, HiveQL, JCL, COBOL, Easytrieve, REXX, VSAM

Databases: Teradata (SQL Assistant), SQL Server, MySQL, Oracle, DB2, MongoDB

Operating Systems: Windows, MS-DOS, UNIX/Linux, z/OS

IDE: Eclipse, TOAD, Microsoft Visio, Atom

Methodologies: Waterfall, Agile, UML, Design Patterns

Version Control Systems: SVN Tortoise, Endevor, Changeman

Build Tools: Maven

Planning: Effort Estimation, Project planning.

Issue Tracker: Atlassian Jira, Remedy

Scheduler: Autosys, Oozie, Ca7

PROFESSIONAL EXPERIENCE:

Confidential, Charlotte, NC

Hadoop Developer

Responsibilities:

  • Developed a Sqoop job to pull PRDS (party reference data) from Teradata into HDFS.
  • Created an EMR cluster in AWS.
  • Loaded data into an Amazon Redshift instance in AWS.
  • Created RDS instances (MySQL, Oracle), loaded data into them, and retrieved the data stored in those instances.
  • Created S3 buckets, copied data into them, and retrieved data stored in S3.
  • Used the Oozie workflow scheduler to manage Hadoop jobs with control-flow dependencies.
  • Loaded data into HBase using both bulk and non-bulk loads (see the HBase sketch after this list).
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response (see the RDD sketch after this list).
  • Performed transformations, cleaning, and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.
  • Developed Spark scripts using Scala shell commands as per requirements.
  • Involved in converting MapReduce programs into Spark transformations using Spark RDDs in Scala.
  • Wrote Sqoop scripts for data movement between databases and HDFS.
  • Imported data from different sources such as HDFS and HBase into Spark RDDs.
  • Worked on improving the performance and optimization of existing Hadoop algorithms with Spark, using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
  • Worked on analyzing Hadoop cluster using different big data analytic tools including Flume, Pig, Hive, HBase, Oozie, Zookeeper, Sqoop, Spark and Kafka.
  • Analyzed the Business Requirements Document and Functional Specification document to develop a detailed test plan and test cases.
  • Involved in working with Spark on top of Yarn/MRv2 for interactive and Batch Analysis. 
  • Developed Spark jobs using Scala in the test environment for faster data processing and used Spark SQL for querying.
  • Created DataFrames by reading the validated Parquet files and ran SQL queries using SQLContext to get the common transaction data from all the systems (see the curation sketch after this list).
  • Combined the validated Parquet files of two or more systems in the curation module to derive the common transaction data.
  • Stored the Parquet data in a Hive database with daily date partitions for further querying.
  • Created separate Parquet files for valid and invalid records for all systems.
  • Implemented repartitioning, caching, and broadcast variables on RDDs and DataFrames to achieve better performance on the cluster.
  • Loaded delimited, position-based, and binary file types into the SparkContext and validated them against XML definitions.
  • Prepared XMLs for each source system (ATM, Loans, Teller, etc.) to validate each record from the HDFS source file; these XMLs are themselves validated against an XSD.
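
HBase sketch: a minimal illustration of the non-bulk load path mentioned above, using the standard HBase client API from Scala; the table name, row key, column family, and values are hypothetical.

    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
    import org.apache.hadoop.hbase.util.Bytes

    object HBasePutSketch {
      def main(args: Array[String]): Unit = {
        val conf = HBaseConfiguration.create()      // picks up hbase-site.xml from the classpath
        val connection = ConnectionFactory.createConnection(conf)
        val table = connection.getTable(TableName.valueOf("party_reference"))  // hypothetical table

        try {
          // Non-bulk load: rows are written one at a time through the region servers.
          val put = new Put(Bytes.toBytes("party#0001"))  // row key
          put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("name"), Bytes.toBytes("ACME LLC"))
          put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("type"), Bytes.toBytes("ORG"))
          table.put(put)
        } finally {
          table.close()
          connection.close()
        }
      }
    }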
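
RDD sketch: a minimal example of loading data from HDFS into a Spark RDD and expressing a MapReduce-style aggregation as Spark transformations; the path, delimiter, and field positions are hypothetical.

    import org.apache.spark.{SparkConf, SparkContext}

    object RddAggregationSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("RddAggregationSketch"))

        // Load a delimited source file from HDFS into an RDD.
        val lines = sc.textFile("hdfs:///data/prds/party_reference.txt")

        // MapReduce-style aggregation rewritten as Spark transformations:
        // map to (key, 1) pairs, then reduceByKey instead of a reducer class.
        val countsByType = lines
          .map(_.split("\\|"))
          .filter(_.length > 2)                // drop malformed records
          .map(fields => (fields(2), 1L))      // assume field 2 holds a party type code
          .reduceByKey(_ + _)
          .cache()                             // keep the result in memory for reuse

        countsByType.take(20).foreach(println)

        sc.stop()
      }
    }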
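
Curation sketch: a minimal outline of the Parquet-to-Hive flow described above (read validated Parquet, join systems via SQL, write with daily date partitions, with a broadcast join and repartitioning), shown here with the SparkSession API rather than the older SQLContext; database, table, and column names are hypothetical.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.broadcast

    object CurationSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("CurationSketch")
          .enableHiveSupport()                 // required for writing Hive-managed tables
          .getOrCreate()

        // Read the validated Parquet output of two source systems.
        val atm    = spark.read.parquet("hdfs:///curated/atm/valid/")
        val teller = spark.read.parquet("hdfs:///curated/teller/valid/")
        atm.createOrReplaceTempView("atm_txn")
        teller.createOrReplaceTempView("teller_txn")

        // Pull the common transactions across the systems with plain SQL.
        val common = spark.sql(
          """SELECT a.txn_id, a.account_id, a.amount, a.load_date
            |FROM atm_txn a
            |JOIN teller_txn t ON a.txn_id = t.txn_id""".stripMargin)

        // Broadcast a small reference table for the join, and repartition before
        // the write to control the number of output files.
        val ref = spark.read.parquet("hdfs:///reference/party/")
        val enriched = common.join(broadcast(ref), Seq("account_id"), "left")

        enriched
          .repartition(200)
          .write
          .mode("append")
          .partitionBy("load_date")            // daily date partitions in the Hive table
          .saveAsTable("curated_db.common_transactions")

        spark.stop()
      }
    }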

Environment: Flume, HBase, Spark, Scala, IntelliJ IDEA, Maven, Kafka, Sqoop, Hive, Oozie, Autosys, Unix, Teradata, Pig, SVN Tortoise

Confidential, Hartford, CT

Hadoop Developer

Responsibilities:
  • Design and migration of existing Teradata system to Hadoop.
  • Worked on Hive and Pig extensively to analyze network data.
  • Automated data pulls from SQL Server into the Hadoop ecosystem via Sqoop.
  • Developed Unix shell scripts to prepare the environment for the application and to delete all the temporary files.
  • Scheduled jobs using Autosys based on various success conditions and the use of file watchers.
  • Developing various ETL transformations using DMX-h tool.
  • Developing complex ETL mapping and its corresponding sessions and workflows.
  • Development of Oozie workflows for launching the jobs.
  • Written Sqoop commands for exporting and importing data in HDFS.
  • Developed compression scripts using various compression techniques and codecs such as Gzip and Bzip2.
  • Set up standards and processes for Hadoop-based application design and implementation.
  • Developed Hive scripts for data transformation and aggregation.
  • Developed MapReduce code to process the input SOR files.
  • Responsible for Hadoop ETL design, development, testing and review of code.
  • Generated ad-hoc reports using Hive to validate customer viewing history and debug issues in production.
  • Delivered Hadoop migration strategy, roadmap and technology fitment. 
  • Wrote Hive queries for data analysis and query optimization.
  • Hands-on experience importing and exporting data between databases such as MySQL, Oracle, and Teradata and HDFS using Sqoop.

Environment: Sqoop, Hive, Oozie, Autosys, Unix, Teradata, Pig, Core Java, Eclipse, DMX-h, SVN Tortoise

Confidential, Jacksonville, FL

Mainframe/Teradata Developer and Module Lead

Responsibilities:

  • Ensure an excellent quality of deliverables and efficiency in meeting deadlines for all assignments
  • Enhancing/Changing existing applications based on business requirement which involves impact analysis and code changes to existing scripts.
  • Developing, testing and implementing new jobs and scripts that involve huge business transformational logic using all Teradata utilities including TPT.
  • Analysis, design, testing and control for entire projects that convert Legacy Mainframe - Teradata applications to Hadoop - Teradata applications that involve complex CDC logic.
  • Performing regression testing for capturing compatibility issues and identifying defects due to new business changes in data.
  • Extracting and analyzing performance metrics for all Teradata queries being added/changed to minimize performance impact and reduce system resource consumption.
  • Analyzing Business Requirements Document and Functional Specification document to develop detailed Test Plan and Test Cases.
  • Building tables and corresponding views based on user requirements and testing the same for compliance and business requirement.
  • Tuning queries in existing applications for better system performance and minimizing application run time.
  • Provided solutions to Teradata and Mainframes related issues and errors.
  • Successfully managed the team of four as Module Lead and trained them in Teradata and Mainframes.
  • Involved in post-implementation support.
  • Assisted in deployment and provided technical and operational support during installation.
  • Performed analysis and provided data to the onshore counterpart for submission to the user community.
  • Performed peer reviews and code walkthroughs.
  • Developed code according to the LLD using TCS tools such as SAS-Raw Data Load (RDL).
  • Compared client-supplied deliverables such as the BRD and HLD against the LLD to identify any incompleteness.
  • Performed impact analysis for new/changed requirements and prepared the LLD (Low-Level Design).

Environment: z/OS, Teradata, COBOL, VSAM, Easytrieve, REXX, CA7, JCL, Changeman, Endevor, File Manager, File-AID, Debug Tool, Expeditor

Confidential

Mainframe Developer

Responsibilities:

  • Understood the business needs and objectives of the system, interacted with the end clients/users, and gathered requirements for the integrated system.
  • Planned and provided permanent fixes for job flows with recurring abends.
  • Provided post-production support during the warranty period.
  • Scheduling of batch jobs using Ca7 Scheduler.
  • Migrated code to production using version control tools such as Changeman and Endevor.
  • Developed Easytrieve programs to reduce coding effort in areas where less data is handled.
  • Monitored regular job flows, fixed abends, and handled emergency severity-1 fixes.
  • Prepared test cases and test plans and performed unit and regression testing of the complete application flow, which includes 300 jobs.
  • Designed REXX tools for creating job setups quickly.
  • Coded MQ interfaces for interacting with upstream applications.
  • Optimization of already existing COBOL codes with effective usage of sort cards.
  • Developed COBOL code using VSAM (Virtual Storage Access Method) for key-sequenced data sets.
  • Co-ordinating with scheduling team for implementation of PODS requests and verifying the scheduled jobs.
  • Designed batch job flows and complete application flows with the Visio editor.
  • Developed complex COBOL-DB2 programs using cursors while minimizing system resource utilization.
  • Sending the processed files to downstream applications using NDM/FTP process.
  • Developing various batch jobs using JCL to submit the instructions on Z OS.
  • Developed COBOL code for a midrange application, extracting and parsing data from various upstream applications such as a Unix server.
  • Developed code using COBOL, JCL, VSAM, and Easytrieve.
  • Preparation of HLD, LLD and BRD (Business requirement document).
  • Worked on Major integrated release ADDP (ATM Debit Detection Platform).
  • Designed nearly 12 applications, namely AMF, AMW, AM-Transit (Transit Check Fraud), KDM (Kite Detection Monitoring), V12 (Signature Verification System), and SNS-Extract.

Environment: z/OS, COBOL, VSAM, Easytrieve, REXX, CA7, JCL, Changeman, Endevor, File Manager, File-AID, DB2, Debug Tool, Expeditor
