
Big Data Developer/Analyst Resume


New York, NY

SUMMARY:

  • Over 5 years of IT experience, including 3 years on the Big Data ecosystem and 2 years on Java EE application development.
  • Experience in the Media, Retail, and Finance domains.
  • Expertise in Hadoop architecture, including YARN, with a deep understanding of workload management, schedulers, scalability, and distributed platform architectures.
  • Experienced with distributions including Cloudera CDH 5.4, Amazon EMR 4.x, and Hortonworks HDP 2.2.
  • Extensive experience in writing MapReduce jobs with the Java API to parse and analyze unstructured data.
  • Extensive experience in writing Pig Latin scripts and HiveQL/Impala queries to process and analyze large volumes of data structured at different levels.
  • Hands-on experience with cluster security and authentication using Kerberos.
  • Good knowledge of serialization formats such as SequenceFile, Avro, and Parquet.
  • Experienced in developing Spark applications using Scala and Python
  • Expertise in collecting, aggregating, and moving large amounts of streaming data using Flume, Kafka, RabbitMQ, and Spark Streaming.
  • Experienced in extracting, transforming, and loading (ETL) data from multiple federated data sources (JSON, relational databases, etc.) with DataFrames in Spark (see the ETL sketch after this list).
  • Extensive experience in importing and exporting data using Sqoop from HDFS/Hive/HBase to Relational Database Systems (RDBMS) and vice versa.
  • Strong experience in writing custom UDFs in Java to extend Hive and Pig functionality.
  • Good understanding of Tachyon, BlinkDB, and Spark GraphX.
  • Experience in designing both time-driven and data-driven automated workflows using Oozie.
  • Strong in core Java, data structures, algorithm design, Object-Oriented Design (OOD), and Java components such as the Collections Framework, exception handling, the I/O system, and multithreading.
  • Hands-on experience with MVC architecture and Java EE frameworks such as Struts2, Spring MVC, and Hibernate.
  • Hands-on experience in Hadoop cluster administration and performance tuning.
  • Experienced with the Docker platform for application development and testing.
  • Worked in various cloud environments such as AWS and Heroku.
  • Extensive experience in unit testing with JUnit, MRUnit, and Pytest.
  • Worked in development environments using Git, JIRA, and Jenkins, following Agile/Scrum and Waterfall with TDD (Test-Driven Development) methodologies.
  • Experienced in Agile and Spiral development environments.
  • A good team player who works independently in a fast-paced, multitasking environment, and a self-motivated learner.
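
As a concrete illustration of the DataFrame-based ETL noted above, the following is a minimal sketch in Java against the Spark 2.x API; the input/output paths, field name, and filter condition are hypothetical.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class JsonEtl {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                .appName("json-etl")   // illustrative app name
                .master("local[*]")    // local mode, for the sketch only
                .getOrCreate();

            // Hypothetical input: JSON events with a "status" field
            Dataset<Row> events = spark.read().json("hdfs:///data/events.json");

            // Transform: keep only completed events, then load as Parquet
            Dataset<Row> completed = events.filter(events.col("status").equalTo("done"));
            completed.write().mode("overwrite").parquet("hdfs:///data/events_clean");

            spark.stop();
        }
    }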

TECHNICAL SKILLS:

Apache Hadoop Eco-system: HDFS, MapReduce V1, MapReduce V2, YARN, Hive 1.2.4, Pig 0.14.0, Sqoop, ZooKeeper 3.4.6, Flume 1.4.0, Kafka 0.8.0, RabbitMQ, Spark 2.1.0, Oozie 4.0.1, Avro, Kerberos, MRUnit

Relational Databases: Oracle 11g/10g/9i, MySQL 5.0, Microsoft SQL Server 9.0, PostgreSQL 8.0

NoSQL Databases: MongoDB 3.2, Cassandra, HBase 0.98

Scripting: UNIX Shell Scripting

Languages: Java, Scala, Python, SQL, HiveQL, Pig Latin

Operating Systems: Linux, Windows, Mac OS

Environment: Agile, Spiral, Waterfall

IDE Applications: Sublime Text, Eclipse, PyCharm, Notepad++

Collaboration: Git, JIRA, Jenkins

PROFESSIONAL EXPERIENCE:

Confidential

New York, NY

Big Data Developer/Analyst

Responsibilities:

  • Work on Confidential with Agile methodology
  • Develop a Kafka consumer to receive and store real-time data from sources (see the consumer sketch after this list)
  • Implement Flume to collect, aggregate, and move web log data from different sources into Kafka
  • Configure Sqoop jobs for importing the input (raw) data from RDBMS and HBase
  • Extract data from MongoDB through the MongoDB Connector for Hadoop
  • Migrate MapReduce programs to Spark using Scala
  • Write Spark Streaming code to process real-time data from Kafka
  • Develop Spark applications with Scala and Spark SQL for testing and processing of data
  • Cooperate with the analytics team to build statistical models with MLlib and PySpark, and prepare and visualize tables in Tableau for reporting
  • Create Oozie coordinator workflows to execute Sqoop jobs.
  • Perform unit testing using JUnit and Pytest
  • Use Git for version control, JIRA for project tracking and Jenkins for continuous integration
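
As referenced in the list above, a minimal sketch of a Kafka consumer in Java; it uses the modern Kafka consumer client, and the broker address, group id, and topic name are illustrative assumptions.

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class WebLogConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // illustrative broker
            props.put("group.id", "weblog-consumers");        // hypothetical group id
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("weblogs")); // hypothetical topic
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        // The real pipeline would persist each record (e.g. to HDFS/HBase)
                        System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                    }
                }
            }
        }
    }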

Environment: Hadoop 2.6, Amazon EMR, HDFS, MapReduce, HBase, Sqoop 1.4.5, Flume 1.5, MongoDB, Spark 1.4, Spark SQL, PySpark, MLlib, Tableau 9.2, JUnit, Pytest

Confidential

New York, NY

Senior Hadoop Developer

Responsibilities:

  • Involved in meetings and releases, working closely with teammates and managers.
  • Implemented Flume to import log data from web servers into HDFS.
  • Translated functional and technical requirements into detailed programs running on Hadoop MapReduce and Spark.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing (see the mapper sketch after this list).
  • Programmed Spark code using Scala for faster processing of data.
  • Wrote traditional database code and distributed system code (mainly HiveQL).
  • Migrated data between RDBMS and HDFS/Hive with Sqoop.
  • Created Hive tables, loaded data, and wrote Hive queries.
  • Used partitioning and bucketing concepts in Hive and designed both managed and external tables for optimized performance.
  • Used HBase for scalable storage and fast queries.
  • Worked with Git for version control, JIRA for project tracking, and Jenkins for continuous integration.
  • Cooperated with the analytics team to prepare and visualize tables in Tableau for reporting.
  • Performed application performance tuning and troubleshooting.
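
As noted in the list above, a minimal sketch of a data-cleaning mapper in Java; the delimiter, field count, and validity rule are illustrative assumptions.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class CleaningMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Hypothetical record layout: comma-separated, at least 3 fields expected
            String[] fields = value.toString().split(",");
            if (fields.length < 3 || fields[0].trim().isEmpty()) {
                return; // drop malformed records as part of cleaning
            }
            // Emit the normalized record for downstream processing
            context.write(new Text(value.toString().trim().toLowerCase()), NullWritable.get());
        }
    }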

Environment: Hadoop 2.0, HBase 1.1.4, MapReduce, Spark 1.4, Flume 1.5.0, Sqoop 1.4.6, Tableau 9.2, Hive 1.2.1, MySQL 5.6, Scala 2.11.x

Confidential

Newark, NJ

Hadoop Big Data Developer

Responsibilities:

  • Worked with a large-scale distributed data solution on a Cloudera CDH4 cluster.
  • Wrote MapReduce code to turn unstructured data into structured data and to insert data into MongoDB.
  • Used Sqoop to import and export data among HDFS, the MySQL database, and Hive.
  • Analyzed data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
  • Developed custom UDFs in Java to extend Hive and Pig functionality (see the UDF sketch after this list).
  • Performed unit testing using MRUnit.
  • Used Oozie to orchestrate the MapReduce jobs and set up automated workflows.
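
A minimal sketch of a custom Hive UDF in Java of the kind mentioned above, assuming a simple string-normalization use case; the class and function names are illustrative.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF that trims and lower-cases a string column.
    // It would be registered in Hive with, for example:
    //   ADD JAR normalize-udf.jar;
    //   CREATE TEMPORARY FUNCTION normalize_str AS 'NormalizeUDF';
    public class NormalizeUDF extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null; // Hive convention: NULL in, NULL out
            }
            return new Text(input.toString().trim().toLowerCase());
        }
    }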

Environment: CDH4, Hadoop 1.2.1, Java JDK 1.6, MapReduce, Pig 0.13.0, Hive, Sqoop 1.4.5, Flume, Oozie, MongoDB 2.4.9

Confidential

Java Developer

Responsibilities:

  • Designed and coded application components with JSP, Servlets, and AJAX
  • Implemented data persistence using JDBC for database connectivity and Hibernate for database/Java object mapping (see the JDBC sketch after this list).
  • Designed the logical and physical data models and generated DDL and DML scripts
  • Designed the user interface and used JavaScript for validation.
  • Wrote SQL queries, stored procedures, and database triggers as required on the database objects.
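
A minimal sketch of the JDBC persistence pattern described above; the connection URL, credentials, and users table are hypothetical.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class UserDao {
        // Illustrative SQL Server connection settings
        private static final String URL = "jdbc:sqlserver://localhost:1433;databaseName=appdb";

        public String findUserName(int userId) throws SQLException {
            String sql = "SELECT name FROM users WHERE id = ?"; // hypothetical table/columns
            try (Connection conn = DriverManager.getConnection(URL, "appuser", "secret");
                 PreparedStatement stmt = conn.prepareStatement(sql)) {
                stmt.setInt(1, userId);
                try (ResultSet rs = stmt.executeQuery()) {
                    return rs.next() ? rs.getString("name") : null;
                }
            }
        }
    }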

Environment: Java, XML, Hibernate, SQL Server, Maven 2, JUnit, J2EE (JSP, JavaBeans, DAO), Eclipse, Apache Tomcat Server, Spring MVC, Spiral Methodology

Confidential

Jr. Java Developer

Responsibilities:

  • Involved in system design based on the Spring, Struts, and Hibernate frameworks.
  • Implemented the business logic in standalone Java classes using core Java.
  • Developed database (MySQL) applications.
  • Used Spring's HibernateTemplate to access the MySQL database (see the DAO sketch after this list).
  • Involved in unit testing of the components: created unit test cases and performed unit test reviews.
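
A minimal sketch of data access via Spring's HibernateTemplate, as mentioned above; the Order entity and its fields are hypothetical.

    import java.util.List;
    import org.springframework.orm.hibernate3.HibernateTemplate;

    public class OrderDao {
        private HibernateTemplate hibernateTemplate; // injected by the Spring container

        public void setHibernateTemplate(HibernateTemplate hibernateTemplate) {
            this.hibernateTemplate = hibernateTemplate;
        }

        public void saveOrder(Order order) {
            hibernateTemplate.save(order); // persists the mapped entity
        }

        @SuppressWarnings("unchecked")
        public List<Order> findByCustomer(long customerId) {
            // HQL query; entity and property names are illustrative
            return (List<Order>) hibernateTemplate.find(
                "from Order o where o.customerId = ?", customerId);
        }
    }

    // Hypothetical Hibernate-mapped entity (getters/setters omitted for brevity)
    class Order {
        private Long id;
        private long customerId;
    }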

Environment: Eclipse, MySQL Client 4.1, Spring, HTML, JavaScript, Hibernate, JSF, JUnit, SDLC: Agile/Scrum
