Big Data Engineer Resume
Bronx, NY
SUMMARY:
- 9+ years of experience in information technology, developing and testing Java/J2EE applications, with expertise in Hadoop/Big Data development and web-based technologies over a range of back-end databases.
- Good knowledge of and exposure to Big Data processing using the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce (MRv1 and YARN), Sqoop, Flume, Kafka, Oozie, ZooKeeper, Spark and Impala.
- Experience with the Cloudera, Hortonworks, MapR and Amazon Web Services distributions of Hadoop.
- Experience in installing, configuring and using ecosystem components like Hadoop MapReduce, Sqoop, Pig, Hive, HBase, HDFS, Impala and Spark.
- Good knowledge of and exposure to the Hadoop architecture and its components, such as HDFS, JobTracker, NameNode, DataNode and TaskTracker.
- Experience in writing custom UDFs in Java to extend Hive and Pig core functionality.
- Good understanding of NoSQL databases and hands-on experience with Apache HBase.
- Expertise in using Sqoop to transfer data between the Hadoop ecosystem and structured data storage in an RDBMS such as MySQL, Oracle, Teradata or DB2.
- Extensive experience in Oracle database design and application development, with in-depth knowledge of SQL and PL/SQL.
- Expertise in various Java/J2EE technologies like JSP, Servlets, Hibernate, Struts and Spring.
- Experience in using the Oozie, Control-M and Autosys workflow engines for managing and scheduling Hadoop jobs.
- Experience with Software Development Life Cycle (SDLC) models such as Waterfall and Agile methodologies, including Test-Driven Development, Scrum and pair programming.
- Good knowledge of web-based UI development using jQuery UI, jQuery, ExtJS, CSS3, HTML, HTML5, XHTML and JavaScript.
- Experience with unit, functional, system and integration testing of applications using JUnit, Mockito, PowerMock, EasyMock, Jasmine and Cucumber.
- Experience in using IDEs like Eclipse and Visual Studio, and DBMSs like Oracle and MySQL.
- Experience working with Windows and Linux-based operating systems such as Windows 7/8, Ubuntu, CentOS and Fedora.
TECHNICAL SKILLS:
Hadoop/Big Data Technologies: Apache Hadoop, HDFS, MapReduce, Pig, Hive, Sqoop, Flume, Hue, HBase, YARN, Oozie, ZooKeeper, MapR Converged Data Platform, Apache Spark, Apache Kafka
Web Technologies: JavaScript, HTML, CSS, XML, AJAX, SOAP
Frameworks: Spring, Hibernate, Struts
Languages: Java, Python, C, C++, SQL, PL/SQL, Ruby, Bash and Perl
SQL/NoSQL Databases: Apache HBase, MongoDB, Cassandra, MS SQL Server, MySQL
Application Servers: WebLogic, WebSphere, Apache Tomcat & JBoss
Testing Frameworks: JUnit, Mockito, PowerMock, EasyMock, Jasmine, Cucumber
Version Control: Git, Subversion, CVS, ClearCase
Documentation Tools: MS Office, iWork, MS Project, MS SharePoint
Operating Systems: Windows, Mac OS, Linux (Ubuntu, CentOS, Fedora)
PROFESSIONAL EXPERIENCE:
Confidential, Bronx, NY
Big Data Engineer
Responsibilities:
- Worked with business stakeholders to translate business objectives and requirements into technical requirements and designs.
- Involved in loading and transforming large sets of structured, semi-structured and unstructured data from multiple source systems into the Macy's Hadoop Data Lake.
- Developed a process for Sqooping data from multiple sources such as SQL Server, Oracle, Teradata and DB2.
- Migrated all SQL sources to the Hadoop Data Lake, loading or moving the data into one or more logical layers (Staging, Landing and Semantic) with the same schema as the source.
- Developed Scala wrappers to generate HiveQL scripts that load or move data between the different logical layers of the Hive Hadoop Data Lake.
- Involved in creating Hive tables, loading data and writing Hive queries per business requirements.
- Performed data transformations in Hive and Spark SQL.
- Implemented partitioning and bucketing of data in Hive to improve performance (see the sketch after this list).
- Developed and supported MapReduce programs running on the cluster.
- Wrote both DML and DDL operations against the Cassandra NoSQL database.
- Developed analytical components using Scala, Spark and Spark Streaming.
- Implemented Flume, Spark and Spark Streaming for real-time data processing.
- Developed a prototype for Big Data analysis using Spark, RDDs, DataFrames and the Hadoop ecosystem with CSV, JSON and Parquet files on HDFS.
- Developed Spark code using Scala and Spark SQL for faster testing and processing of data, and explored optimizations using SparkContext, Spark SQL, pair RDDs and Spark on YARN.
- Wrote Spark programs in Scala and migrated existing MapReduce programs to Spark.
- Created source-to-target mapping documents tracing source fields to destination fields.
- Developed a shell script to create Staging, Landing and Semantic tables with the same schema as the source.
- Developed HiveQL scripts to perform transformation logic and load the data from the Staging zone into the Landing and Semantic zones.
- Responsible for debugging and optimizing Hive scripts.
- Automated all jobs that pull data from FTP servers or SQL sources into Hive tables using Control-M.
- Created HBase tables to store the final aggregated data from the Hadoop system.
- Generated reports on Hive tables for different scenarios using Tableau.
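For illustration, a minimal Java sketch of the partitioned, bucketed Hive load described above, driven through Spark SQL; the table names, columns and bucket count are hypothetical stand-ins, not the actual schema:

    import org.apache.spark.sql.SparkSession;

    public class SemanticLoader {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("staging-to-semantic")
                    .enableHiveSupport()   // lets spark.sql() run HiveQL against the metastore
                    .getOrCreate();

            // Hypothetical Semantic-layer table, partitioned by date and bucketed by customer.
            spark.sql("CREATE TABLE IF NOT EXISTS semantic.orders ("
                    + " order_id BIGINT, customer_id BIGINT, amount DECIMAL(12,2))"
                    + " PARTITIONED BY (order_date STRING)"
                    + " CLUSTERED BY (customer_id) INTO 32 BUCKETS"
                    + " STORED AS ORC");

            // Dynamic-partition load from a same-schema Staging table.
            spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict");
            spark.sql("INSERT OVERWRITE TABLE semantic.orders PARTITION (order_date)"
                    + " SELECT order_id, customer_id, amount, order_date FROM staging.orders");

            spark.stop();
        }
    }

Partitioning prunes whole directories at query time, while bucketing clusters rows by a hash of customer_id so joins and sampling on that key read fewer files.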
Environment: HDP 2.2.4.2, Hive, Pig, Oozie, Sqoop, Flume, Spark, Spark SQL, Scala, HBase, Cassandra, SAP HANA, SAP BODS, Tableau
Confidential, Gardner, KS
Sr. Big Data Engineer
Responsibilities:
- Analyzed the Hadoop cluster using different big data analytic tools, including Kafka, Pig, Hive and MapReduce.
- Configured Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS using Scala (see the sketch after this list).
- Implemented Spark jobs using Scala and Spark SQL for faster analysis and processing of data.
- Handled importing and exporting data into HDFS and Hive using Sqoop and Kafka.
- Involved in creating Hive tables, loading the data and writing Hive queries, which run internally as MapReduce jobs.
- Designed and developed ETL workflows in Java for processing data in HDFS/HBase, orchestrated with Oozie.
- Imported unstructured data into HDFS using Flume.
- Wrote complex Hive queries and UDFs.
- Exported data from HDFS into an RDBMS using Sqoop for report generation and visualization purposes.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Used Flume extensively to gather and move log data files from application servers to a central location in the Hadoop Distributed File System (HDFS).
- Developed shell scripts to ease execution of the other scripts (Pig, Hive and MapReduce) and to move data files into and out of HDFS.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python and Scala.
- Worked with cloud services like Amazon Web Services (AWS) and was involved in ETL, data integration and migration.
- Worked with NoSQL databases like HBase and Cassandra, creating tables to load large sets of semi-structured data.
- Built Java APIs for retrieval and analysis against NoSQL databases such as HBase and Cassandra.
- Loaded data from the UNIX file system into HDFS.
- Analyzed Cassandra and compared it with other open-source NoSQL databases to determine which best suited the current requirements.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
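A minimal Java sketch of the Kafka-to-HDFS streaming path described above, using the spark-streaming-kafka-0-10 integration; the broker address, topic name, group id and output path are placeholders:

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaDStream;
    import org.apache.spark.streaming.api.java.JavaInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka010.ConsumerStrategies;
    import org.apache.spark.streaming.kafka010.KafkaUtils;
    import org.apache.spark.streaming.kafka010.LocationStrategies;

    public class KafkaToHdfs {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("kafka-to-hdfs");
            JavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(30));

            Map<String, Object> kafkaParams = new HashMap<>();
            kafkaParams.put("bootstrap.servers", "broker1:9092"); // placeholder broker
            kafkaParams.put("key.deserializer", StringDeserializer.class);
            kafkaParams.put("value.deserializer", StringDeserializer.class);
            kafkaParams.put("group.id", "hdfs-loader");
            kafkaParams.put("auto.offset.reset", "latest");

            JavaInputDStream<ConsumerRecord<String, String>> stream =
                    KafkaUtils.createDirectStream(
                            ssc,
                            LocationStrategies.PreferConsistent(),
                            ConsumerStrategies.<String, String>Subscribe(
                                    Collections.singletonList("events"), kafkaParams));

            // Write each 30-second batch of message payloads to a time-stamped HDFS directory.
            JavaDStream<String> values = stream.map(ConsumerRecord::value);
            values.foreachRDD((rdd, time) -> {
                if (!rdd.isEmpty()) {
                    rdd.saveAsTextFile("hdfs:///data/landing/events/" + time.milliseconds());
                }
            });

            ssc.start();
            ssc.awaitTermination();
        }
    }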
Environment: Hadoop, HDFS, MapReduce, Hive, Sqoop, HBase, Apache Spark, Oozie Scheduler, Java, UNIX Shell Scripts, Kafka, Git, Maven, PL/SQL, Python, Scala, Cloudera
Confidential, Columbus, OH
Sr. Big Data/Hadoop Developer
Responsibilities:
- Worked on the BI team on Big Data Hadoop cluster implementation and data integration, developing large-scale system software.
- Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
- Worked extensively with Sqoop to import and export data between HDFS and relational database systems/mainframes, and loaded data into HDFS.
- Captured data from existing databases that provide SQL interfaces using Sqoop.
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
- Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and Pig to pre-process the data.
- Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
- Imported and exported data between Oracle/DB2 and HDFS/Hive using Sqoop.
- Created ALTER, INSERT and DELETE queries involving lists, sets and maps in DataStax Cassandra (see the sketch after this list).
- Architected Hadoop clusters with CDH4 on CentOS, managed with Cloudera Manager.
- Managed Hadoop jobs using the Oozie workflow scheduler for MapReduce, Hive, Pig and Sqoop actions.
- Initiated and completed a proof of concept on Flume for pre-processing, showing increased reliability and easier scalability compared with traditional MSMQ.
- Used Flume to collect log data from different sources and load it into Hive tables, using different SerDes to store the data in JSON, XML and SequenceFile formats.
- Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
- Supported setting up the QA environment and updating configurations for implementing Pig and Sqoop scripts.
- Implemented testing scripts to support test-driven development and continuous integration.
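As an illustration of the Cassandra collection work above, a minimal Java sketch using the DataStax Java driver (3.x API); the contact point, keyspace and table are hypothetical:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    public class CassandraCollectionsDemo {
        public static void main(String[] args) {
            // Placeholder contact point; assumes a keyspace named "demo" already exists.
            try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
                 Session session = cluster.connect("demo")) {

                // Hypothetical table with list, set and map columns.
                session.execute("CREATE TABLE IF NOT EXISTS customer_profile ("
                        + " customer_id bigint PRIMARY KEY,"
                        + " phone_numbers list<text>,"
                        + " emails set<text>,"
                        + " preferences map<text, text>)");

                // ALTER: add a column to the existing table.
                session.execute("ALTER TABLE customer_profile ADD last_login timestamp");

                // INSERT: collection literals for the list, set and map columns.
                session.execute("INSERT INTO customer_profile"
                        + " (customer_id, phone_numbers, emails, preferences)"
                        + " VALUES (42, ['555-0100'], {'a@example.com'}, {'theme': 'dark'})");

                // DELETE: remove a single map entry rather than the whole row.
                session.execute("DELETE preferences['theme'] FROM customer_profile"
                        + " WHERE customer_id = 42");
            }
        }
    }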
Environment: Hadoop, MapReduce, HDFS, Hive, Java (JDK 1.6), Hadoop distributions from Hortonworks, Cloudera and MapR, DataStax, IBM DataStage 8.1 (Designer, Director, Administrator), PL/SQL, SQL*Plus, Toad 9.6, Windows NT, UNIX Shell Scripting
Confidential, Arlington, VA
Sr. Java/J2EE Developer
Responsibilities:
- Developed the application using the Struts framework, which leverages the classic Model-View-Controller (MVC) architecture; produced UML diagrams such as use cases, class diagrams, interaction diagrams and activity diagrams.
- Participated in requirement gathering and converted the requirements into technical specifications.
- Worked extensively on the user interface for a few modules using JSPs, JavaScript and Ajax.
- Created business logic using servlets and session beans and deployed them on WebLogic Server.
- Developed XML schemas and web services for data maintenance and structures.
- Implemented the Web Service client for the login authentication, credit reports and applicant information using Apache Axis 2 Web Service.
- Developed workflows using custom MapReduce, Pig, Hive and Sqoop.
- Built reusable Hive UDF libraries for business requirements, enabling users to apply these UDFs in Hive queries.
- Developed a data pipeline using Kafka and Storm to store data into HDFS.
- Maintained third-party software and databases with updates/upgrades, performance tuning and monitoring.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Created UDFs to calculate the pending payment for a given residential or small-business customer, used in Pig and Hive scripts (see the sketch after this list).
- Managed data coming from different sources.
- Developed Shell, Perl and Python scripts to automate and provide control flow to Pig scripts.
- Used the Hibernate ORM framework with the Spring framework for data persistence and transaction management.
- Wrote JUnit test cases for unit testing of classes.
- Developed templates and screens in HTML and JavaScript.
- Involved in integrating Web Services using WSDL and UDDI
- Built and deployed Java applications into multiple UNIX-based environments and produced both unit and functional test results along with release notes.
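For illustration, a minimal Java sketch of a pending-payment Hive UDF like the one described above; the arithmetic and names are hypothetical stand-ins for the original business rule:

    import org.apache.hadoop.hive.ql.exec.UDF;

    // Hypothetical rule: pending payment = amount billed minus amount paid, floored at zero.
    public class PendingPaymentUDF extends UDF {
        public Double evaluate(Double billed, Double paid) {
            if (billed == null) {
                return null; // propagate NULL for rows with no billing data
            }
            double owed = billed - (paid == null ? 0.0 : paid);
            return Math.max(owed, 0.0);
        }
    }

Packaged as a JAR, the function would be registered in Hive with ADD JAR followed by CREATE TEMPORARY FUNCTION pending_payment AS 'PendingPaymentUDF'; a Pig equivalent would extend org.apache.pig.EvalFunc.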
Environment: JDK 1.5, J2EE 1.4, Struts 1.3, Kafka, Storm, JSP, Servlets 2.5, WebSphere 6.1, HTML, XML, Ant 1.6, Perl, Python, JavaScript, JUnit 3.8
Confidential
Sr. Java/J2EE Developer
Responsibilities:
- Designed and developed the application using Agile methodology.
- Implemented new modules and change requests, and fixed defects identified in pre-production and production environments.
- Wrote technical design document with class, sequence, and activity diagrams in each use case.
- Created wiki pages in Confluence for documentation.
- Developed various reusable helper and utility classes which were used across all modules of the application.
- Involved in developing XML compilers using XQuery.
- Developed the application using the Spring MVC framework, implementing controller and service classes (see the sketch after this list).
- Wrote the Spring configuration XML file containing bean declarations and the declarations of their dependent objects.
- Used Hibernate as the persistence framework, creating DAOs and using Hibernate for ORM mapping.
- Wrote Java classes to test the UI and web services through JUnit.
- Performed functional and integration testing and was extensively involved in critical release/deployment activities.
- Designed rich user interface applications using JSP, JSP tag libraries, Spring tag libraries, JavaScript, CSS and HTML.
- Used SVN for version control and Log4j to log both user-interface and domain-level messages.
- Used SoapUI for testing the web services.
- Used Maven for dependency management and project structure.
- Created deployment documents for various environments such as Test, QC and UAT.
- Involved in system wide enhancements supporting the entire system and fixing reported bugs.
- Explored Spring MVC, Spring IoC, Spring AOP and Hibernate in creating the POC.
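A minimal Java sketch of the Spring MVC controller/service layering described above; the class names, mapping and view are illustrative, not the original application's code:

    import org.springframework.stereotype.Controller;
    import org.springframework.ui.Model;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;

    // Hypothetical domain object; fields and accessors omitted for brevity.
    class Account {
    }

    // Hypothetical service interface; the implementation would be declared as a bean
    // in the Spring configuration XML and delegate to a Hibernate-backed DAO.
    interface AccountService {
        Account findById(long id);
    }

    @Controller
    public class AccountController {

        private final AccountService accountService;

        public AccountController(AccountService accountService) {
            this.accountService = accountService; // wired via the XML bean definition
        }

        @RequestMapping(value = "/accounts/{id}", method = RequestMethod.GET)
        public String showAccount(@PathVariable("id") long id, Model model) {
            model.addAttribute("account", accountService.findById(id));
            return "accountDetail"; // logical view name, resolved to a JSP
        }
    }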
Environment: Java, J2EE, JSP, Spring, Hibernate, CSS, JavaScript, Oracle, JBoss, Maven, Eclipse, JUnit, Log4j, AJAX, Web services, JNDI, JMS, HTML, XML, XSD, XML Schema.