Big Data Developer/Analyst Resume
New York, NY
SUMMARY:
- Over 5 years of IT experience, including 3 years in the Big Data ecosystem and 2 years in Java EE application development.
- Experience in the Media, Retail, and Finance domains.
- Expertise in Hadoop architecture, including YARN, with a deep understanding of workload management, schedulers, scalability, and distributed platform architectures.
- Experienced with distributions including Cloudera CDH 5.4, Amazon EMR 4.x, and Hortonworks HDP 2.2.
- Extensive experience in writing MapReduce jobs with the Java API to parse and analyze unstructured data.
- Extensive experience in writing Pig Latin scripts and HiveQL/Impala queries to process and analyze large volumes of data with varying degrees of structure.
- Hands-on experience with cluster security and authentication using Kerberos.
- Good knowledge of serialization formats such as SequenceFile, Avro, and Parquet.
- Experienced in developing Spark applications using Scala and Python.
- Expertise in collecting, aggregating, and moving large amounts of streaming data using Flume, Kafka, RabbitMQ, and Spark Streaming.
- Experienced in extract, transform, and load (ETL) of data from multiple federated data sources (JSON, relational databases, etc.) with DataFrames in Spark (see the sketch after this list).
- Extensive experience in importing and exporting data between HDFS/Hive/HBase and relational database management systems (RDBMS) using Sqoop.
- Strong experience in writing custom UDFs in Java to extend Hive and Pig functionality.
- Good understanding of Tachyon, BlinkDB, and Spark GraphX.
- Experience in designing both time-driven and data-driven automated workflows using Oozie.
- Strong in core Java: data structures, algorithm design, object-oriented design (OOD), and Java components such as the Collections Framework, exception handling, the I/O system, and multithreading.
- Hands-on experience with MVC architecture and Java EE frameworks such as Struts 2, Spring MVC, and Hibernate.
- Hands-on experience in Hadoop cluster administration and performance tuning.
- Experienced with the Docker platform for application development and testing.
- Worked in various cloud environments, including AWS and Heroku.
- Extensive experience in unit testing with JUnit, MRUnit, and pytest.
- Worked in development environments using Git, JIRA, and Jenkins, under Agile/Scrum and Waterfall with TDD (Test-Driven Development) methodologies.
- Experienced in Agile and Spiral development environments.
- A good team player who works independently in a fast-paced, multitasking environment, and a self-motivated learner.
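A minimal sketch of the Spark DataFrame ETL pattern referenced above, assuming Spark 2.x; the input path and field names are hypothetical placeholders:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class JsonEtlSketch {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("json-etl-sketch")
                    .getOrCreate();

            // Extract: load semi-structured JSON into a DataFrame; schema is inferred.
            Dataset<Row> raw = spark.read().json("hdfs:///data/events.json"); // placeholder path

            // Transform: keep well-formed records and project the fields of interest.
            Dataset<Row> cleaned = raw
                    .filter("userId IS NOT NULL")           // assumes a userId field
                    .select("userId", "eventType", "ts");   // assumed field names

            // Load: write out as Parquet for downstream Hive/Impala queries.
            cleaned.write().mode("overwrite").parquet("hdfs:///warehouse/events_parquet");

            spark.stop();
        }
    }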
TECHNICAL SKILLS:
- Apache Hadoop Ecosystem: HDFS, MapReduce v1/v2, YARN, Hive 1.2.4, Pig 0.14.0, Sqoop, ZooKeeper 3.4.6, Flume 1.4.0, Kafka 0.8.0, RabbitMQ, Spark 2.1.0, Oozie 4.0.1, Avro, Kerberos, MRUnit
- Relational Databases: Oracle 11g/10g/9i, MySQL 5.0, Microsoft SQL Server 9.0, PostgreSQL 8.0
- NoSQL Databases: MongoDB 3.2, Cassandra, HBase 0.98
- Scripting: UNIX shell scripting
- Languages: Java, Scala, Python, SQL, HiveQL, Pig Latin
- Operating Systems: Linux, Windows, Mac OS
- Development Methodologies: Agile, Spiral, Waterfall
- IDEs/Editors: Sublime Text, Eclipse, PyCharm, Notepad++
- Collaboration: Git, JIRA, Jenkins
PROFESSIONAL EXPERIENCE:
Confidential, New York, NY
Big Data Developer/Analyst
Responsibilities:
- Work on Confidential using Agile methodology
- Develop Kafka consumers to receive and store real-time data from sources
- Implement Flume to collect, aggregate, and move web log data from different sources to Kafka
- Configure Sqoop jobs for importing the input (raw) data from RDBMS and HBase
- Extract data from MongoDB through the MongoDB Connector for Hadoop
- Migrate MapReduce programs to Spark using Scala
- Write Spark Streaming code to process real-time data from Kafka (see the sketch below)
- Develop Spark applications with Scala and Spark SQL for testing and processing of data
- Cooperate with the analytics team to build statistical models with MLlib and PySpark, and prepare and visualize tables in Tableau for reporting
- Create Oozie coordinator workflows to execute Sqoop jobs
- Perform unit testing using JUnit and Pytest
- Use Git for version control, JIRA for project tracking and Jenkins for continuous integration
Environment: Hadoop 2.6, Amazon EMR, HDFS, MapReduce, HBase, Sqoop 1.4.5, Flume 1.5, MongoDB, Spark 1.4, Spark SQL, PySpark, MLlib, Tableau 9.2, JUnit, pytest
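A minimal sketch of the Kafka-to-Spark Streaming path above, using the Spark 1.x direct-stream API for Kafka 0.8; the broker address and topic name are placeholders:

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Set;

    import kafka.serializer.StringDecoder;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaPairInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka.KafkaUtils;

    public class KafkaStreamSketch {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("kafka-stream-sketch");
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            Map<String, String> kafkaParams = new HashMap<>();
            kafkaParams.put("metadata.broker.list", "broker1:9092"); // placeholder broker
            Set<String> topics = Collections.singleton("weblogs");   // placeholder topic

            // Direct (receiver-less) stream: one RDD partition per Kafka partition.
            JavaPairInputDStream<String, String> stream = KafkaUtils.createDirectStream(
                    jssc, String.class, String.class,
                    StringDecoder.class, StringDecoder.class,
                    kafkaParams, topics);

            // Minimal processing step: count records in each 10-second batch.
            stream.count().print();

            jssc.start();
            jssc.awaitTermination();
        }
    }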
Confidential, New York, NY
Senior Hadoop Developer
Responsibilities:
- Involved in meetings and releases, working closely with teammates and managers.
- Implemented Flume to import log data from web servers into HDFS.
- Translated functional and technical requirements into detailed programs running on Hadoop MapReduce and Spark.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Programmed Spark code in Scala for faster data processing.
- Wrote traditional database code and distributed system code (mainly HiveQL).
- Migrated data between RDBMS and HDFS/Hive with Sqoop.
- Created Hive tables, loaded data, and wrote Hive queries.
- Applied partitioning and bucketing concepts in Hive, and designed both managed and external tables for optimized performance (see the sketch below).
- Used HBase for scalable storage and fast queries.
- Used Git for version control, JIRA for project tracking, and Jenkins for continuous integration.
- Cooperated with the analytics team to prepare and visualize tables in Tableau for reporting.
- Performed application performance tuning and troubleshooting.
Environment: Hadoop 2.0, HBase 1.1.4, MapReduce, Spark 1.4, Flume 1.5.0, Sqoop 1.4.6, Tableau 9.2, Hive 1.2.1, MySQL 5.6, Scala 2.11.x
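A minimal sketch of the partitioned, bucketed external Hive table design mentioned above, issued here over HiveServer2 JDBC; the host, credentials, table, and columns are illustrative:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class HiveTableSketch {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:hive2://localhost:10000/default", "hive", ""); // placeholders
                 Statement stmt = conn.createStatement()) {

                // External table: Hive manages metadata only; data stays at LOCATION.
                // Partitioned by date for partition pruning; bucketed by ip for sampling/joins.
                stmt.execute(
                    "CREATE EXTERNAL TABLE IF NOT EXISTS web_logs ("
                  + "  ip STRING, url STRING, status INT)"
                  + " PARTITIONED BY (dt STRING)"
                  + " CLUSTERED BY (ip) INTO 16 BUCKETS"
                  + " ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'"
                  + " LOCATION '/data/web_logs'");

                // Register one day's partition with its directory.
                stmt.execute(
                    "ALTER TABLE web_logs ADD IF NOT EXISTS PARTITION (dt='2016-01-01')"
                  + " LOCATION '/data/web_logs/dt=2016-01-01'");
            }
        }
    }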
Confidential, Newark, NJ
Hadoop Big Data Developer
Responsibilities:
- Worked with a large-scale distributed data solution on a Cloudera CDH4 cluster.
- Wrote MapReduce code to turn unstructured data into structured data and insert it into MongoDB.
- Used Sqoop to import and export data among HDFS, MySQL, and Hive.
- Analyzed data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
- Developed custom UDFs in Java to extend Hive and Pig Latin functionality (see the sketch below).
- Performed unit testing using MRUnit.
- Used Oozie to orchestrate MapReduce jobs and set up automated workflows.
Environment: CDH4, Hadoop 1.2.1, Java JDK 1.6, MapReduce, Pig 0.13.0, Hive, Sqoop 1.4.5, Flume, Oozie, MongoDB 2.4.9
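A minimal sketch of a custom Hive UDF like those described above, using the classic UDF interface; the URL-normalization logic is illustrative:

    import org.apache.hadoop.hive.ql.exec.Description;
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Classic (pre-GenericUDF) interface: Hive resolves evaluate() by reflection.
    @Description(name = "normalize_url",
                 value = "_FUNC_(url) - lowercases a URL and strips a trailing slash")
    public class NormalizeUrlUDF extends UDF {
        private final Text result = new Text();

        public Text evaluate(Text url) {
            if (url == null) {
                return null; // propagate SQL NULL
            }
            String s = url.toString().trim().toLowerCase();
            if (s.endsWith("/")) {
                s = s.substring(0, s.length() - 1);
            }
            result.set(s);
            return result;
        }
    }

Once packaged into a JAR, such a UDF would be registered in HiveQL with ADD JAR and CREATE TEMPORARY FUNCTION before use in queries.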
Confidential
Java Developer
Responsibilities:
- Designed and coded application components with JSP, Servlets, and AJAX.
- Implemented data persistence using JDBC for database connectivity and Hibernate for database-to-Java object mapping (see the sketch below).
- Designed the logical and physical data models and generated DDL and DML scripts.
- Designed the user interface and used JavaScript for validation.
- Wrote SQL queries, stored procedures, and database triggers as required on the database objects.
Environment: Java, XML, Hibernate, SQL Server, Maven 2, JUnit, J2EE (JSP, JavaBeans, DAO), Eclipse, Apache Tomcat, Spring MVC, Spiral methodology
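A minimal sketch of the JDBC persistence pattern from this role; the connection URL, credentials, table, and DAO are placeholders:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class OrderDao { // hypothetical DAO
        private static final String URL =
                "jdbc:sqlserver://localhost:1433;databaseName=shop"; // placeholder

        public double totalForCustomer(int customerId) throws SQLException {
            String sql = "SELECT SUM(amount) FROM orders WHERE customer_id = ?";
            try (Connection conn = DriverManager.getConnection(URL, "app", "secret");
                 PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setInt(1, customerId); // bind parameter; avoids SQL injection
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getDouble(1) : 0.0;
                }
            }
        }
    }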
Confidential
Jr. Java Developer
Responsibilities:
- Involved in system design based on the Spring-Struts-Hibernate framework.
- Implemented the business logic in standalone Java classes using core Java.
- Developed database (MySQL) applications.
- Used Spring's HibernateTemplate to access the MySQL database (see the sketch below).
- Involved in unit testing of components; created unit test cases and performed unit test reviews.
Environment: Eclipse, MySQL Client 4.1, Spring, HTML, JavaScript, Hibernate, JSF, JUnit, SDLC: Agile/Scrum
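A minimal sketch of the HibernateTemplate access pattern used here (Spring 2.x/3.x era), with a hypothetical Account entity:

    import java.util.List;
    import org.springframework.orm.hibernate3.support.HibernateDaoSupport;

    // Hypothetical mapped entity (mapping and getters/setters omitted for brevity).
    class Account {
        private Long id;
        private String owner;
    }

    public class AccountDao extends HibernateDaoSupport { // hypothetical DAO

        public void save(Account account) {
            // HibernateTemplate handles session open/close and exception translation.
            getHibernateTemplate().saveOrUpdate(account);
        }

        @SuppressWarnings("unchecked")
        public List<Account> findByOwner(String owner) {
            // HQL query against the mapped Account entity.
            return (List<Account>) getHibernateTemplate()
                    .find("from Account a where a.owner = ?", owner);
        }
    }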