
Sr. Hadoop/Spark Developer Resume


San Antonio, TX

SUMMARY

  • 6 years of overall IT experience in a variety of industries, including hands-on experience of 4+ years in Big Data analytics and development and 2 years in ETL data warehousing development
  • Expertise with tools in the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce, Sqoop, Spark, Kafka, YARN, Oozie, and Zookeeper.
  • Excellent knowledge of Hadoop ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm
  • Experience in designing and developing applications in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle.
  • Experience in manipulating/analyzing large datasets and finding patterns and insights within structured and unstructured data.
  • Strong experience with the IBM BigInsights Hadoop distribution
  • Good understanding of NoSQL databases and hands-on experience writing applications on NoSQL databases such as HBase and Cassandra
  • Experienced in writing complex MapReduce programs that work with different file formats such as Text, Sequence, XML, Parquet, and Avro.
  • Experience in migrating data using Sqoop between HDFS and relational database systems, in both directions, according to the client's requirements.
  • Extensive Experience on importing and exporting data using stream processing platforms like Flume and Kafka.
  • Very good experience in complete project life cycle (design, development, testing and implementation) of Client Server and Web applications.
  • Strong experience with data warehousing ETL concepts using DataStage, Informatica PowerCenter, OLAP, OLTP, and AutoSys.
  • Experience in database design using PL/SQL to write stored procedures, functions, and triggers, and strong experience in writing complex queries for Oracle.
  • Strong experience in Object-Oriented Design, Analysis, Development, Testing and Maintenance.
  • Experienced in using agile approaches, including Extreme Programming, Test-Driven Development and Agile Scrum.
  • Worked in large and small teams for systems requirement, design & development.
  • Key participant in all phases of software development life cycle with Analysis, Design, Development, Integration, Implementation, Debugging, and Testing of Software Applications in client server environment, Object Oriented Technology and Web based applications.
  • Experience in using various IDEs Eclipse, IntelliJ and repositories SVN and Git.
  • Experience using the build tools SBT and Maven.
  • Prepared standard coding guidelines and analysis and testing documentation.
  • Domain knowledge in insurance

TECHNICAL SKILLS

Languages: C, Java, Scala, Python, SQL, PL/SQL, Pig Latin, HiveQL, JavaScript, Shell Scripting

Java & J2EE Technologies: Core Java, Servlets, Hibernate, Spring, Struts, JMS, EJB

Application Servers: WebLogic, WebSphere, JBoss, Tomcat

Tools: IBM Unica, DataStage 7.x/8.x/9.1 (Administrator, Designer, Director, Manager), Informatica 8.6/9.1

Databases: Oracle 10g/9i/8i, MS SQL Server 2000/2005, DB2 9/10, Netezza, IMS

Operating systems: Windows, UNIX and Mainframes z/OS

Big Data: IBM BigInsights, Apache Hadoop, Pig, Hive, HBase, Jaql

Reporting tools: SAP Business Objects Webi

Scheduler: Control-M

PROFESSIONAL EXPERIENCE

Confidential, San Antonio, TX

Sr. Hadoop/Spark Developer

Responsibilities:

  • Designed and developed a custom MapReduce job to ingest click-stream data received from Adobe and IVR data received from Nuance into Hadoop
  • Developed a custom MapReduce job to perform data cleanup, transform data from Text to Avro, and write output directly into Hive tables by generating dynamic partitions
  • Developed custom FTP and SFTP drivers to pull flat files from UNIX and Windows into Hadoop and tokenize identified sensitive data from input records on the fly, in parallel
  • Developed custom InputFormat, RecordReader, Mapper, Reducer, and Partitioner classes as part of developing end-to-end Hadoop applications
  • Developed a custom Sqoop-based tool to import data residing in relational databases, tokenize identified sensitive columns on the fly, and store the results in Hadoop
  • Worked with the HBase Java API to populate operational HBase tables with key-value data
  • Wrote Spark applications using spark-shell, PySpark, and spark-submit
  • Developed prototype Spark applications using Spark Core, Spark SQL, and the DataFrame API (see the Spark sketch after this list)
  • Developed several custom user-defined functions (UDFs) in Hive and Pig using Java and Python
  • Used Hive partitioning and bucketing and performed different types of joins on Hive tables
  • Developed Sqoop jobs to perform full and incremental imports of data from relational tables into Hadoop in formats such as Text, Avro, and Sequence, and into Hive tables
  • Used HCatalog to load text-format Hive tables and transform data into Avro within Pig scripts
  • Worked on a custom Hadoop InputFormat to read fixed-length ASCII files
  • Developed applications to ingest data into Hadoop using Flume
  • Improved the performance of existing Hadoop jobs by rewriting them with Spark SQL and the DataFrame API.
  • Developed Java code to generate, compare, and merge Avro schema files
  • Designed and developed external and managed Hive tables with data formats such as Text, Avro, SequenceFile, RC, ORC, and Parquet (see the DDL sketch after this list)
  • Designed and developed static and dynamically partitioned, bucketed Hive tables
  • Designed and developed Hive queries and Pig scripts to build Type 1 and Type 2 dimensions and facts, and tuned the performance of Pig scripts and Hive queries
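
The Spark development above combined Spark Core, Spark SQL, and the DataFrame API with dynamically partitioned Hive output. Below is a minimal sketch of how such a job might look; the HDFS path, column names, and Hive table name (/data/raw/clickstream, web.clickstream_events) are hypothetical placeholders rather than the actual project artifacts.

```scala
import org.apache.spark.sql.SparkSession

object ClickstreamLoad {
  def main(args: Array[String]): Unit = {
    // Hive-enabled session so saveAsTable registers the table in the metastore
    val spark = SparkSession.builder()
      .appName("ClickstreamLoad")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // Parse pipe-delimited click-stream records into a typed DataFrame
    val clicks = spark.read
      .textFile("/data/raw/clickstream")          // hypothetical landing path
      .filter(_.split('|').length == 4)           // drop malformed records
      .map { line =>
        val f = line.split('|')
        (f(0), f(1), f(2), f(3))
      }
      .toDF("member_id", "page", "event_ts", "load_dt")

    // Write to a metastore table, generating partitions from load_dt at write time
    clicks.write
      .mode("append")
      .partitionBy("load_dt")
      .saveAsTable("web.clickstream_events")      // hypothetical database.table

    spark.stop()
  }
}
```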
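
The Hive table work above centered on external tables, columnar formats such as ORC, and partitioned loads. The following is a minimal DDL and dynamic-partition insert sketch issued through Spark SQL; the database, table, column, and path names (edw.member_dim, edw_stage.member_stage, /data/edw/member_dim) are assumptions for illustration only.

```scala
import org.apache.spark.sql.SparkSession

object MemberDimDdl {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("MemberDimDdl")
      .enableHiveSupport()
      .getOrCreate()

    // External Hive table stored as ORC, partitioned by load date
    spark.sql("""
      CREATE EXTERNAL TABLE IF NOT EXISTS edw.member_dim (
        member_sk   BIGINT,
        member_id   STRING,
        first_name  STRING,
        last_name   STRING,
        current_flg STRING
      )
      PARTITIONED BY (load_dt STRING)
      STORED AS ORC
      LOCATION '/data/edw/member_dim'
    """)

    // Dynamic-partition load: partition values come from the load_dt column
    // of the staging table at insert time
    spark.sql("SET hive.exec.dynamic.partition.mode = nonstrict")
    spark.sql("""
      INSERT OVERWRITE TABLE edw.member_dim PARTITION (load_dt)
      SELECT member_sk, member_id, first_name, last_name, current_flg, load_dt
      FROM edw_stage.member_stage
    """)

    spark.stop()
  }
}
```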

Environment: Hadoop Ecosystem, Spark SQL, Spark Streaming, DataStage 9.1, DB2 (SQuirreL SQL Client), Oracle 11g, Netezza, Python, Java, Control-M, UNIX Shell Scripting.

Confidential, San Antonio, TX

Sr. ETL/Hadoop Developer

Responsibilities:

  • Created high-level flow diagrams and design documents
  • Developed custom FTP and SFTP drivers to pull flat files from UNIX and Windows into Hadoop (see the ingestion sketch after this list)
  • Extracted Member Service Representative data from Clicktrail and Splunk
  • Created source-to-target mappings and validated use cases.
  • Planned the project timeline and monitored execution and results.
  • Designed and unit-tested ETL DataStage jobs
  • Worked with teams on system testing, bug/defect fixing, and maintaining the test log.
  • Handled Control-M job scheduling and warranty support.
  • Transitioned application functionality and documentation to the O&M team
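
The flat-file ingestion above landed files pulled over FTP/SFTP into Hadoop. The sketch below shows only the HDFS-landing step using the Hadoop FileSystem API; the local staging and HDFS target paths are hypothetical, and the remote transfer itself is assumed to be handled by a separate driver.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object HdfsFileLoader {
  def main(args: Array[String]): Unit = {
    // Assumes the feed file was already pulled from the FTP/SFTP host
    // into a local staging directory by the transfer driver
    val localFile = new Path("/staging/member_feed/member_feed.dat")   // hypothetical local path
    val hdfsDir   = new Path("/data/raw/member_feed")                  // hypothetical HDFS target

    val conf = new Configuration()   // picks up core-site.xml / hdfs-site.xml from the classpath
    val fs   = FileSystem.get(conf)

    if (!fs.exists(hdfsDir)) fs.mkdirs(hdfsDir)

    // delSrc = false keeps the local copy; overwrite = true replaces any prior load
    fs.copyFromLocalFile(false, true, localFile, hdfsDir)
    fs.close()
  }
}
```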

Environment: Hadoop Ecosystem, DataStage 9.1, DB2, Oracle 11g, Control-M, UNIX Shell Scripting

Confidential, Chennai, TX

April-2013 to Dec-2014

Developer (Hadoop Ecosystem, ETL DataStage, BO Reports)

Confidential

Responsibilities:

  • Gathered requirements and prepared specifications for Hadoop and ETL jobs.
  • Worked with DBAs on data model and table changes
  • Worked with the team on ETL and BO dashboards and reports.
  • Created source-to-target mappings and validated use cases.
  • Coordinated with the data modeling team and was involved in dimensional data modeling.
  • Created dimension and fact tables for Asset and Lifecycle data and loaded them into Netezza target tables.
  • Designed and developed a reusable component to load data from datasets into Netezza tables.
  • Planned the project timeline and monitored execution and results.
  • Worked with teams on system testing, bug/defect fixing, and maintaining the test log.
  • Handled Control-M job scheduling and warranty support.

Environment: Hadoop Ecosystem, DataStage 9.1, DB2 (SQuirreL SQL Client), Netezza, SQL Server 2008, WinSCP, Control-M, UNIX Shell Scripting, UNIX, BO Webi

Confidential, Chennai, TX

Datastage Developer

Responsibilities:

  • IT onsite coordinator for the client (Confidential) in the Property & Casualty insurance area of business, for a team of 12 resources offshore in Chennai, India.
  • Worked on maintenance and enhancements of the current EDW (Enterprise Data Warehouse).
  • Acted as the defect queue lead while fixing day-to-day defects and preventatives across the Property & Casualty insurance domain services.
  • Directly involved in the DB2 to IBM Netezza database migration, which was a success across different LOBs at Confidential.
  • Worked on multiple small mod projects using Agile and Waterfall methodologies based on the requirements, including daily Agile stand-up ceremonies.
  • Worked on different change requests (CRs), fixing various defects related to ETL Informatica code changes, UNIX script implementations, and database changes, including correcting historically incorrect data.
  • Used RTC (Rational Team Concert) to check in changed code and scripts for version control and to log work items.
  • Scheduled and monitored Control-M cycles while changes were in the warranty period.

Environment: DataStage 8.1, DB2 (SQuirreL SQL Client), Netezza, SQL Server 2008, WinSCP, Control-M, UNIX Shell Scripting

Confidential, San Antonio, TX

Sr. Developer (ETL - DataStage, Informatica)

Responsibilities:

  • Production support and job fixes
  • Performance improvement and job monitoring
  • Created a job inventory sheet for the DataStage 8.1 to 9.1 migration.
  • Updated jobs, scripts, and parameter files to Confidential naming standards.
  • Implemented the Confidential ETL framework for all jobs.
  • Removed unused job parameters from all jobs
  • Created new Control-M job names and script names.
  • Coordinated with team members to expedite the process and deliverables.

Environment: DataStage 9.1, DataStage 8.1, Informatica, Oracle 9i, DB2, SQL Server, Netezza, FTP tools, WinSCP, UNIX
