
Big Data Engineer Resume


Pleasanton, CA

SUMMARY:

  • (JAVA: 12 years, BIG DATA/HADOOP: 2 years) 12+ years of professional experience in the analysis, design, and development of applications in Core Java/J2EE, with Big Data ecosystem development across Spark, Kafka, Hive, MapReduce, HDFS, ZooKeeper, Sqoop, YARN, Oozie, and Avro/Parquet/ORC technologies.
  • Proficient in object-oriented programming (OOP) principles, applying proven design patterns and algorithms to build high-performance applications in Java.
  • Good working experience with the Spark framework for both batch and real-time data processing.
  • Strong working experience with the Cloudera and Hortonworks Hadoop distributions.
  • Extensively worked on Spark 2.1.1 Structured Streaming (see the sketch after this list).
  • Strong working experience with Kafka streaming to ingest data in real time or near real time.
  • Experience in data processing pipelines: collecting, aggregating, and moving data from various sources using Kafka.
  • Experience in extracting source data from log files, flat files, and Excel files, then transforming and loading it into the target data warehouse.
  • Experience in migrating data with Sqoop between HDFS and relational database systems, in either direction, as required.
  • Experience using the Avro, Parquet, and ORC file formats.
  • Experience with Azure autoscaling and provisioning using Azure Resource Manager (ARM) templates.
  • Worked on IaaS and PaaS software/component installation in the Azure cloud.
  • Experience in Hortonworks installation, version upgrades, and management of Hadoop clusters.
  • Experience working in agile development environments, including the Scrum methodology.
  • Extensively worked with build and CI tools such as Maven, Ant, and Jenkins.
  • Worked with version control tools such as CVS, Stash, Git, and SVN.
  • Worked with BI (Business Intelligence) teams on generating reports and designing ETL workflows; loaded data from various sources into HDFS and built reports on top of it.
  • Good working knowledge of the Spring and Struts frameworks, Servlets, JSP, Hibernate, and web services.
  • Good working knowledge of business practices in the life sciences, healthcare, and insurance domains.
  • Extensive usage of SQL, PL/SQL, Stored Procedures and Views in Oracle 9i, 10g and 11g.
  • Good interpersonal and communication skills, with the ability to communicate effectively at all levels of the development process.
  • Ability to learn quickly and adapt to new technical environments.
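
As a minimal illustration of the Spark Structured Streaming experience above, the sketch below reads string messages from a Kafka topic and echoes each micro-batch to the console. The broker address (localhost:9092) and topic name (events) are placeholders, not details from any engagement described in this resume.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;

public class KafkaStreamSketch {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("kafka-stream-sketch")
                .getOrCreate();

        // Subscribe to a Kafka topic; each record arrives as key/value byte arrays.
        Dataset<Row> events = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "localhost:9092") // placeholder broker
                .option("subscribe", "events")                       // placeholder topic
                .load()
                .selectExpr("CAST(value AS STRING) AS message");

        // Write each micro-batch to the console; a real job would target Hive/HDFS.
        StreamingQuery query = events.writeStream()
                .format("console")
                .outputMode("append")
                .start();

        query.awaitTermination();
    }
}
```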

TECHNICAL SKILLS:

Highlights: Java 1.8/1.7, Big Data, Spark 2.1.1, Kafka 0.10, Hive, MapReduce, HDFS, ZooKeeper, YARN, Oozie, Sqoop 1.4, Avro/Parquet/ORC, Cloudera, Hortonworks, Azure, Tomcat 7 & 8, Jetty Web 4.2.27, MySQL, Oracle 9i & 10g

Big Data Ecosystem: Spark, Kafka, HDFS, ZooKeeper, MapReduce, Oozie, Sqoop, YARN, Avro, Parquet, ORC and Hive

Languages: Java 8.x/7.x, SQL, PL/SQL, HTML, DHTML, JavaScript, XML

Web Technologies: J2EE, Struts 2, Spring 3.1, Servlets, JSP, SOAP, WSDL, SOA, JavaScript, Ajax, Ext JS, XML, HTML, REST

Databases: Oracle 9i/10g/11g, SQL Server 2005, Netezza, MySQL 5

Linux Distributions: CentOS, Red Hat

Version Controls: SVN, CVS, Stash, Git

Build Tools: ANT, Maven, Jenkins

Operating systems: UNIX/Linux, Windows

Web Servers/ Application Server: Apache web server, Apache Tomcat, Jetty Web, Spring Boot

Frameworks: MVC Architecture, J2EE Design Patterns, Jakarta Struts, Hibernate 4.1, Spring, Hadoop

ETL Tools: Spark, BODS, Pentaho

Tools: Eclipse, Enterprise Architect 6.5, Lisa, Xcelsius, Soap UI, BOBJ, Alfresco

PROFESSIONAL EXPERIENCE:

Confidential, Pleasanton, CA

Big Data Engineer

Responsibilities:

  • Created and implemented robust data ingestion frameworks that allow for the expedited migration of numerous disparate data sources into HDFS.
  • Developed Spark Structured Streaming jobs to read data from Kafka in real-time and batch modes, apply different change data capture (CDC) modes, and load the data into Hive.
  • Developed a file watcher using the Java NIO API and notification services using Kafka (see the file-watcher sketch after this list).
  • Developed various services: checksum computation, copying files to HDFS, encryption and decryption, compression, duplication, and zipping.
  • Maintained Kafka read offsets to preserve data consistency and integrity.
  • Developed a common message model for the data-receiving pipeline.
  • Developed independent services for configuration (Kafka, system, and database), producer manifests, and data manifests using Spring Boot REST APIs.
  • Converted multiline records to single-line records using MapReduce jobs.
  • Worked on converting schemas to Avro schemas using Apache Avro (see the Avro sketch after this list).
  • Developed Kafka producers and consumers to send and receive messages (a producer sketch follows this list).
  • Worked with the Schema Registry to register, apply, and build Avro-format schemas for Hive external tables.
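
A minimal sketch of the NIO-based file watcher mentioned above, using the standard java.nio.file.WatchService. The watched directory (/data/landing) is a placeholder; in the actual pipeline, each detected file would be handed to the checksum/copy/encryption services rather than printed.

```java
import java.nio.file.*;

public class FileWatcherSketch {
    public static void main(String[] args) throws Exception {
        Path dir = Paths.get("/data/landing"); // placeholder landing directory
        WatchService watcher = FileSystems.getDefault().newWatchService();

        // Register for create events; a real watcher might also track modifies/deletes.
        dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);

        while (true) {
            WatchKey key = watcher.take(); // blocks until events are available
            for (WatchEvent<?> event : key.pollEvents()) {
                Path created = dir.resolve((Path) event.context());
                // Here the pipeline would trigger the downstream ingestion services.
                System.out.println("New file detected: " + created);
            }
            if (!key.reset()) {
                break; // directory no longer accessible
            }
        }
    }
}
```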
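For the Avro conversion work, a sketch of building and serializing a record against a hypothetical two-field schema; the real project schemas (and their registration in the Schema Registry) are not reproduced here.

```java
import java.io.ByteArrayOutputStream;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

public class AvroConversionSketch {
    // Hypothetical schema standing in for the project's converted schemas.
    private static final String SCHEMA_JSON =
            "{\"type\":\"record\",\"name\":\"Event\",\"fields\":["
          + "{\"name\":\"id\",\"type\":\"long\"},"
          + "{\"name\":\"payload\",\"type\":\"string\"}]}";

    public static void main(String[] args) throws Exception {
        Schema schema = new Schema.Parser().parse(SCHEMA_JSON);

        // Build a record conforming to the schema.
        GenericRecord record = new GenericData.Record(schema);
        record.put("id", 42L);
        record.put("payload", "hello");

        // Serialize to Avro binary, as would be written to Kafka or HDFS.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(schema).write(record, encoder);
        encoder.flush();
        System.out.println("Serialized " + out.size() + " bytes");
    }
}
```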
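And a minimal Kafka producer of the kind used for notifications, assuming a local broker and a hypothetical notifications topic; it sends a single string message and flushes before closing.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class NotificationProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // Fire-and-forget send; a real service would check the returned Future.
            producer.send(new ProducerRecord<>("notifications", "file-arrived", "event payload"));
            producer.flush();
        }
    }
}
```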

Environment: Big Data, Spark, Kafka, Java, HDFS, ZooKeeper, MapReduce, YARN, Avro, Hive, Spring Boot, REST API, Jackson API, MySQL, Maven, Jetty, Stash/Git, Azure, Cloudera, GoldenGate.

Confidential, San Francisco, CA

Technical Lead / Big Data Engineer

Responsibilities:

  • Developed Sqoop logic to pull data from data warehouses (CPD+, EDW, MTV, ODS) into the Hadoop landing area in Parquet format and create external Hive tables on top of it.
  • Developed Spark jobs to transform data and load it into the stage and target databases (see the batch-job sketch after this list).
  • Developed Oozie workflows to schedule the Spark jobs that load data and generate reports, storing the results in HDFS for internal and external users to access through the applications.
  • Worked with end users to evaluate business and technical requirements.
  • Developed REST services for the reporting functionality.
  • Designed the data model for the standalone report scheduler and the reporting application.
  • Managed and coordinated the team; responsible for deliverables to the client.
  • Researched new frameworks and developed functional features in the application.
  • Performed peer reviews of the application developed by the team using techniques such as unit testing, system testing, and code inspections.
  • Wrote cron jobs and shell scripts for automation.
  • Release management and deployment on the DEV, MOT/UAT, and production servers.
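
A minimal sketch of one such Spark batch job, written against the Spark 1.5 DataFrame API listed in this project's environment. The landing path, table name, and filter are hypothetical stand-ins for the project's actual transformation logic.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.hive.HiveContext;

public class StageLoadSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("stage-load-sketch");
        JavaSparkContext sc = new JavaSparkContext(conf);
        HiveContext hiveContext = new HiveContext(sc.sc());

        // Read the Parquet files Sqoop landed in HDFS (path is a placeholder).
        DataFrame landed = hiveContext.read().parquet("/data/landing/orders");

        // Stand-in transformation: keep active rows and a subset of columns.
        DataFrame staged = landed
                .filter("status = 'ACTIVE'")
                .select("order_id", "customer_id", "amount");

        // Overwrite the stage table on each run (table name is hypothetical).
        staged.write().mode(SaveMode.Overwrite).saveAsTable("stage.orders_active");

        sc.stop();
    }
}
```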

Environment: Java 1.7, Big Data, Sqoop, Hive, Impala, Oozie, HDFS, Cloudera, Spark 1.5, J2EE, JSP, Spring 3.1, Hibernate 4.1, JavaScript, Ajax, Web services, Oracle 11g, Tomcat 7, Velocity, BO Webi, BODS, AngularJS, Maven.

Confidential

Technical Lead/Developer

Responsibilities:

  • Worked with end users to evaluate business and technical requirements.
  • Responsible for deliverables to the client.
  • Developed analytical dashboards using the Fluid Analytics Engine (FAE).
  • Designed the underlying data model using the Netezza database.
  • Team management and coordination.
  • Performed peer reviews of the application developed by the team using techniques such as unit testing, system testing, and code inspections.
  • Release management and deployment.

Environment: Java 1.7, J2EE, JSP, Spring 3.1, Hibernate 4.1, JavaScript, Ajax, Web services, Netezza, Tomcat 7

Confidential, San Jose CA

Technical Lead

Responsibilities:

  • Developed Spring REST services for the application's functionality (see the controller sketch after this list).
  • Automated the Kettle (Pentaho ETL) jobs.
  • Wrote cron jobs and shell scripts for automation.
  • Designed the data model for the application.
  • Worked with end users to evaluate business and technical requirements.
  • Responsible for deliverables to the client.
  • Team management and coordination.
  • Researched new frameworks and developed functional features in the application.
  • Added new functionality to the application.
  • Performed peer reviews of the application developed by the team using techniques such as unit testing, system testing, and code inspections.
  • Release management and deployment.
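
A minimal sketch of a Spring REST endpoint of the kind described above, using the @Controller/@ResponseBody style that predates Spring 4's @RestController (this project's environment lists Spring 3.1); the resource and DTO names are hypothetical.

```java
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.ResponseBody;

@Controller
@RequestMapping("/api/reports")
public class ReportController {

    // Simple DTO serialized to JSON by Spring's message converters.
    public static class ReportStatus {
        public final long id;
        public final String status;

        public ReportStatus(long id, String status) {
            this.id = id;
            this.status = status;
        }
    }

    // GET /api/reports/{id}/status returns the report's current state.
    @RequestMapping(value = "/{id}/status", method = RequestMethod.GET)
    @ResponseBody
    public ReportStatus getStatus(@PathVariable("id") long id) {
        // A real service would look this up; hard-coded for the sketch.
        return new ReportStatus(id, "COMPLETED");
    }
}
```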

Environment: Java 1.7, J2EE, JSP, Spring 3.1, Hibernate 4.1, JavaScript, Ajax, Web services, MySQL 5, Tomcat 7, Velocity, Pentaho (ETL).

Confidential, San Francisco CA

Senior Software Engineer

Responsibilities:

  • Integrated two applications (.NET and Java) for the OMS-DMS integration.
  • KnowledgeTree tool support.
  • Migrated production servers to higher-end server configurations.
  • HTTPS implementation.
  • Coordinated offshore and onsite teams.
  • Researched new frameworks to be incorporated.
  • Developed functional features.
  • Issue resolution.

Environment: Java, J2EE, JSP, JavaScript, PHP, Ajax, Struts, MySQL, Tomcat 6.0.18, Apache 2.2.11, KnowledgeTree 3.6.1, MS SQL Server 2005.
