
Big Data Engineer Resume


Bronx, NY

SUMMARY:

  • Over 9 years of experience in information technology, developing and testing Java/J2EE applications, with expertise in Hadoop/Big Data development and web-based technologies backed by a range of databases.
  • Good knowledge of and exposure to Big Data processing with the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce (MRv1 and YARN), Sqoop, Flume, Kafka, Oozie, ZooKeeper, Spark, and Impala.
  • Experience with the Cloudera, Hortonworks, MapR, and Amazon Web Services distributions of Hadoop.
  • Experience in installing, configuring, and using ecosystem components such as Hadoop MapReduce, HDFS, Sqoop, Pig, Hive, HBase, Impala, and Spark.
  • Good understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, and DataNode.
  • Experience writing custom Java UDFs to extend Hive and Pig core functionality (see the Pig UDF sketch after this list).
  • Good understanding of NoSQL databases and hands-on experience with Apache HBase.
  • Expertise in transferring data between the Hadoop ecosystem and structured data stores in an RDBMS such as MySQL, Oracle, Teradata, or DB2 using Sqoop.
  • Extensive experience in Oracle database design and application development, with in-depth knowledge of SQL and PL/SQL.
  • Expertise in Java/J2EE technologies such as JSP, Servlets, Hibernate, Struts, and Spring.
  • Experience using the Oozie, Control-M, and Autosys workflow engines to manage and schedule Hadoop jobs.
  • Experience with Software Development Life Cycle (SDLC) models including Waterfall and Agile methodologies such as Test-Driven Development, Scrum, and pair programming.
  • Good knowledge of web-based UI development using jQuery UI, jQuery, ExtJS, CSS3, HTML, HTML5, XHTML, and JavaScript.
  • Experience with unit, functional, system, and integration testing of applications using JUnit, Mockito, Jasmine, Cucumber, PowerMock, and EasyMock.
  • Experience with IDEs such as Eclipse and Visual Studio, and with DBMSs such as Oracle and MySQL.
  • Experience working with Windows and Linux-based operating systems, including Windows 7/8, Ubuntu, CentOS, and Fedora.
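
As an illustration of the UDF work above, here is a minimal, hypothetical Pig EvalFunc in Java; the function name and the trim/upper-case rule are illustrative stand-ins, not actual production logic:

    // Hypothetical Pig UDF: normalizes a chararray field by trimming
    // whitespace and upper-casing it.
    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    public class NormalizeField extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null; // Pig treats this as a null field
            }
            return input.get(0).toString().trim().toUpperCase();
        }
    }

In a Pig script this would be wired in with REGISTER normalize-udf.jar; followed by DEFINE normalize NormalizeField(); (both names hypothetical).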

TECHNICAL SKILLS:

Hadoop/Big Data Technologies: Apache Hadoop, HDFS and MapReduce, Pig, Hive, Sqoop, Flume, Hue, HBase, YARN, Oozie, ZooKeeper, MapR Converged Data Platform, Apache Spark, Apache Kafka

Web Technologies: JavaScript, HTML, CSS, XML, AJAX, SOAP

MVC Frameworks: Spring, Hibernate, Struts

Languages: Java, Python, C, C++, SQL, PL/SQL, Ruby, Bash, Perl

SQL/NoSQL Databases: Apache HBase, MongoDB, Cassandra, MS SQL Server, MySQL

Application Servers: WebLogic, WebSphere, Apache Tomcat, JBoss

Testing Frameworks: JUnit, Mockito, PowerMock, EasyMock, Jasmine, Cucumber

Version Control: Git, Subversion, CVS, ClearCase

Documentation Tools: MS Office, iWork, MS Project, MS SharePoint

Operating Systems: Windows, Mac OS

PROFESSIONAL EXPERIENCE:

Confidential, Bronx, NY

Big Data Engineer

Responsibilities:

  • Worked with business stakeholders to translate business objectives and requirements into technical requirements and designs.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data from multiple source systems into the Macy's Hadoop data lake.
  • Developed a process for Sqooping data from multiple sources such as SQL Server, Oracle, Teradata, and DB2.
  • Migrated all SQL sources to the Hadoop data lake, loading and moving the data into Staging, Landing, and Semantic logical layers with the same schema as the source.
  • Developed Scala wrappers to generate HiveQL scripts that load and move data between the logical layers of the Hive data lake.
  • Created Hive tables, loaded data, and wrote Hive queries per business requirements.
  • Performed data transformations in Hive and Spark SQL.
  • Implemented partitioning and bucketing of data in Hive to improve performance (see the loading sketch after this list).
  • Developed and supported MapReduce programs running on the cluster.
  • Wrote both DML and DDL operations for the Cassandra NoSQL database.
  • Developed analytical components using Scala, Spark, and Spark Streaming.
  • Implemented the Flume, Spark, and Spark Streaming frameworks for real-time data processing.
  • Developed a prototype for Big Data analysis using Spark RDDs, DataFrames, and the Hadoop ecosystem with CSV, JSON, and Parquet files on HDFS.
  • Developed Spark code using Scala and Spark SQL for faster testing and processing of data, and explored optimizations using SparkContext, Spark SQL, pair RDDs, and Spark on YARN.
  • Wrote Spark programs in Scala and migrated existing MapReduce programs to Spark using Scala.
  • Created source-to-target mapping documents covering source-field to destination-field mappings.
  • Developed a shell script to create Staging, Landing, and Semantic tables with the same schema as the source.
  • Developed HiveQL scripts to perform transformation logic and to load data from the Staging zone to the Landing and Semantic zones.
  • Debugged and optimized Hive scripts.
  • Automated the jobs that pull data from FTP servers or SQL sources into Hive tables using Control-M.
  • Created HBase tables to store the final aggregated data from the Hadoop system.
  • Generated reports on Hive tables in different scenarios using Tableau.
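
The layer-to-layer loads above were driven by generated HiveQL; below is a minimal sketch of the same partitioned-load pattern through the Spark SQL Java API (using the later SparkSession API rather than the Scala wrappers; table and column names are hypothetical, and bucketing is omitted for brevity):

    import org.apache.spark.sql.SparkSession;

    public class StagingToLanding {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("StagingToLanding")
                    .enableHiveSupport() // connect to the Hive metastore
                    .getOrCreate();

            // Landing table partitioned by load date (hypothetical schema)
            spark.sql("CREATE TABLE IF NOT EXISTS lnd_orders "
                    + "(order_id BIGINT, amount DOUBLE) "
                    + "PARTITIONED BY (load_dt STRING) STORED AS ORC");

            // Allow dynamic-partition inserts, then copy the staging layer
            // across with the same schema as the source
            spark.sql("SET hive.exec.dynamic.partition=true");
            spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict");
            spark.sql("INSERT OVERWRITE TABLE lnd_orders PARTITION (load_dt) "
                    + "SELECT order_id, amount, load_dt FROM stg_orders");

            spark.stop();
        }
    }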

Environment: HDP 2.2.4.2, Hive, Pig, Oozie, Sqoop, Flume, Spark, Spark SQL, Scala, HBase, Cassandra, SAP HANA, SAP BODS, Tableau

Confidential, Gardner, KS

Sr. Big Data Engineer

Responsibilities:

  • Analyzed the Hadoop cluster using big data analytic tools including Kafka, Pig, Hive, and MapReduce.
  • Configured Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS using Scala (see the streaming sketch after this list).
  • Implemented Spark jobs using Scala and Spark SQL for faster analysis and processing of data.
  • Imported and exported data into HDFS and Hive using Sqoop and Kafka.
  • Created Hive tables, loaded data, and wrote Hive queries, which run internally as MapReduce jobs.
  • Designed and developed ETL workflows in Java for processing data in HDFS/HBase, orchestrated with Oozie.
  • Imported unstructured data into HDFS using Flume.
  • Wrote complex Hive queries and UDFs.
  • Exported data from HDFS into an RDBMS using Sqoop for report generation and visualization.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Used Flume extensively to gather and move log data files from application servers to a central location in the Hadoop Distributed File System (HDFS).
  • Developed shell scripts to ease execution of the other scripts (Pig, Hive, and MapReduce) and to move data files within and outside of HDFS.
  • Converted Hive/SQL queries into Spark transformations using Spark RDDs, Python, and Scala.
  • Worked with Amazon Web Services (AWS) cloud services on ETL, data integration, and migration.
  • Worked with NoSQL databases such as HBase and Cassandra, creating tables to load large sets of semi-structured data.
  • Built Java APIs for retrieval and analysis against NoSQL databases such as HBase and Cassandra.
  • Loaded data from UNIX file systems into HDFS.
  • Analyzed Cassandra and compared it with other open-source NoSQL databases to determine which best suited the requirements.
  • Analyzed large datasets to determine the optimal way to aggregate and report on them.
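
A minimal sketch of the Kafka-to-HDFS streaming path described above, written against the Spark Streaming Kafka 0-10 Java API; the broker address, topic, group id, and output path are hypothetical placeholders:

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka010.ConsumerStrategies;
    import org.apache.spark.streaming.kafka010.KafkaUtils;
    import org.apache.spark.streaming.kafka010.LocationStrategies;

    public class KafkaToHdfs {
        public static void main(String[] args) throws InterruptedException {
            JavaStreamingContext jssc = new JavaStreamingContext(
                    new SparkConf().setAppName("KafkaToHdfs"),
                    Durations.seconds(30)); // 30-second micro-batches

            Map<String, Object> kafkaParams = new HashMap<>();
            kafkaParams.put("bootstrap.servers", "broker1:9092");
            kafkaParams.put("key.deserializer", StringDeserializer.class);
            kafkaParams.put("value.deserializer", StringDeserializer.class);
            kafkaParams.put("group.id", "events-loader");

            KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(
                                Collections.singletonList("events"), kafkaParams))
                    .map(ConsumerRecord::value)
                    // one HDFS directory of part files per micro-batch
                    .foreachRDD((rdd, time) -> rdd.saveAsTextFile(
                            "hdfs:///data/landing/events/" + time.milliseconds()));

            jssc.start();
            jssc.awaitTermination();
        }
    }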

Environment: Hadoop, HDFS, MapReduce, Hive, Sqoop, HBase, Apache Spark, Oozie Scheduler, Java, UNIX shell scripts, Kafka, Git, Maven, PL/SQL, Python, Scala, Cloudera

Confidential, Columbus, OH

Sr. Big Data/Hadoop Developer

Responsibilities:

  • Worked on the BI team on Big Data Hadoop cluster implementation and data integration, developing large-scale system software.
  • Developed MapReduce programs to parse raw data, populate staging tables, and store the refined data in partitioned tables in the EDW (a map-only cleansing sketch follows this list).
  • Worked extensively with Sqoop to import and export data between HDFS and relational database systems/mainframes, and loaded data into HDFS.
  • Captured data from existing databases that provide SQL interfaces using Sqoop.
  • Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
  • Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and Pig to pre-process the data.
  • Provided design recommendations and thought leadership to sponsors/stakeholders, improving review processes and resolving technical problems.
  • Imported and exported data between Oracle/DB2 and HDFS/Hive using Sqoop.
  • Created alter, insert, and delete queries involving lists, sets, and maps in DataStax Cassandra.
  • Architected Hadoop clusters with CDH4 on CentOS, managed with Cloudera Manager.
  • Managed Hadoop jobs with the Oozie workflow scheduler for MapReduce, Hive, Pig, and Sqoop actions.
  • Initiated and completed a proof of concept on Flume, demonstrating pre-processing, increased reliability, and easier scalability over traditional MSMQ.
  • Used Flume to collect log data from different sources and transferred it to Hive tables using different SerDes, storing it in JSON, XML, and SequenceFile formats.
  • Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
  • Supported setting up the QA environment and updating configurations for implementing Pig and Sqoop scripts.
  • Implemented testing scripts to support test-driven development and continuous integration.
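
A minimal, map-only sketch of the kind of parsing/cleansing MapReduce job described above; the pipe-delimited five-field layout and the paths taken from args are hypothetical:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class CleanseRawData {

        public static class CleanseMapper
                extends Mapper<LongWritable, Text, NullWritable, Text> {
            @Override
            protected void map(LongWritable key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\\|", -1);
                if (fields.length != 5 || fields[0].isEmpty()) {
                    ctx.getCounter("cleanse", "malformed").increment(1);
                    return; // drop malformed records, keep a count
                }
                ctx.write(NullWritable.get(), value); // emit validated row
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "cleanse-raw-data");
            job.setJarByClass(CleanseRawData.class);
            job.setMapperClass(CleanseMapper.class);
            job.setNumReduceTasks(0); // map-only: no shuffle needed
            job.setOutputKeyClass(NullWritable.class);
            job.setOutputValueClass(Text.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }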

Environment: Hadoop, MapReduce, HDFS, Hive, Java (JDK 1.6), Hadoop distributions from Hortonworks, Cloudera, and MapR, DataStax, IBM DataStage 8.1 (Designer, Director, Administrator), PL/SQL, SQL*Plus, Toad 9.6, Windows NT, UNIX shell scripting

Confidential, Arlington, VA

Sr. Java/J2EE Developer

Responsibilities:

  • Developed the application using the Struts framework, which leverages the classical Model-View-Controller (MVC) architecture; produced UML artifacts such as use cases, class diagrams, interaction diagrams, and activity diagrams.
  • Participated in requirements gathering and converted the requirements into technical specifications.
  • Worked extensively on the user interface for several modules using JSPs, JavaScript, and Ajax.
  • Created business logic using Servlets and session beans and deployed them on a WebLogic server.
  • Developed the XML Schemas and web services for data maintenance and structures.
  • Implemented the web service client for login authentication, credit reports, and applicant information using Apache Axis2.
  • Developed workflows using custom MapReduce, Pig, Hive, and Sqoop.
  • Built reusable Hive UDF libraries for business requirements, enabling users to apply these UDFs in Hive queries.
  • Developed a data pipeline using Kafka and Storm to store data in HDFS.
  • Maintained third-party software and databases with updates/upgrades, performance tuning, and monitoring.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Created UDFs to calculate the pending payment for a given residential or small-business customer and used them in Pig and Hive scripts (see the UDF sketch after this list).
  • Managed data coming from different sources.
  • Developed shell, Perl, and Python scripts to automate and provide control flow to Pig scripts.
  • Used the Hibernate ORM framework with the Spring framework for data persistence and transaction management.
  • Wrote JUnit test cases for unit testing of classes.
  • Built templates and screens in HTML and JavaScript.
  • Integrated web services using WSDL and UDDI.
  • Built and deployed Java applications into multiple Unix-based environments and produced both unit and functional test results along with release notes.
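
A minimal sketch of a pending-payment Hive UDF like the one described above; the due-minus-paid, floored-at-zero rule is a hypothetical stand-in for the actual billing logic:

    import org.apache.hadoop.hive.ql.exec.UDF;

    public final class PendingPaymentUDF extends UDF {
        // Hive resolves evaluate() by reflection; DOUBLE columns map to Double
        public Double evaluate(Double amountDue, Double amountPaid) {
            if (amountDue == null) {
                return null; // propagate NULL like the built-in functions
            }
            double pending = amountDue - (amountPaid == null ? 0.0 : amountPaid);
            return Math.max(pending, 0.0);
        }
    }

In HiveQL the jar would be added with ADD JAR and the function registered with CREATE TEMPORARY FUNCTION pending_payment AS 'PendingPaymentUDF' before use in a query.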

Environment: JDK 1.5, J2EE 1.4, Struts 1.3, Kafka, Storm, JSP, Servlets 2.5, WebSphere 6.1, HTML, XML, Ant 1.6, Perl, Python, JavaScript, JUnit 3.8

Confidential

Sr. Java/J2EE Developer

Responsibilities:

  • Designed and developed the application using Agile methodology.
  • Implemented new modules and change requests, and fixed defects identified in pre-production and production environments.
  • Wrote technical design documents with class, sequence, and activity diagrams for each use case.
  • Created wiki pages using Confluence documentation.
  • Developed reusable helper and utility classes used across all modules of the application.
  • Developed XML compilers using XQuery.
  • Developed the application with the Spring MVC framework, implementing controller and service classes (see the controller sketch after this list).
  • Wrote the Spring configuration XML file containing bean declarations and dependent object declarations.
  • Used Hibernate as the persistence framework; created DAOs and used Hibernate for ORM mapping.
  • Wrote Java classes to test the UI and web services through JUnit.
  • Performed functional and integration testing and was extensively involved in critical release/deployment activities. Designed rich user interface applications using JSP, JSP tag libraries, Spring tag libraries, JavaScript, CSS, and HTML.
  • Used SVN for version control; Log4j was used to log both user-interface and domain-level messages.
  • Used SoapUI for testing the web services.
  • Used Maven for dependency management and project structure.
  • Created deployment documents for various environments such as Test, QC, and UAT.
  • Involved in system-wide enhancements supporting the entire system and fixing reported bugs.
  • Explored Spring MVC, Spring IoC, Spring AOP, and Hibernate in creating the POC.
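
A minimal sketch of the Spring MVC controller/service split described above; the class names, URL mapping, and view name are hypothetical:

    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Controller;
    import org.springframework.ui.Model;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;

    @Controller
    public class AccountController {

        private final AccountService accountService; // injected service bean

        @Autowired
        public AccountController(AccountService accountService) {
            this.accountService = accountService;
        }

        @RequestMapping(value = "/accounts/{id}", method = RequestMethod.GET)
        public String viewAccount(@PathVariable("id") long id, Model model) {
            model.addAttribute("account", accountService.findById(id));
            return "accountDetail"; // logical view name resolved to a JSP
        }
    }

    // Hypothetical service-layer contract; the implementation would sit
    // behind the Hibernate DAOs mentioned above.
    interface AccountService {
        Object findById(long id);
    }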

Environment: Java, J2EE, JSP, Spring, Hibernate, CSS, JavaScript, Oracle, JBoss, Maven, Eclipse, JUnit, Log4j, AJAX, web services, JNDI, JMS, HTML, XML, XSD, XML Schema
