
Big Data Developer/analyst Resume


New York, NY

SUMMARY:

  • Over 6 years of professional IT experience with Big Data ecosystem technologies and Java/J2EE technologies in industries including financial services and telecommunications
  • Experienced in the Big Data ecosystem with Hadoop 2.0, HDFS, MapReduce, Pig 0.12+, Hive 1.0+, HBase 0.98+, Sqoop 1.3+, Flume 1.3+, Kafka 1.2+, Oozie 3.0+, and Spark 1.3+
  • Experienced with distributions including Cloudera CDH 5.8 and Hortonworks HDP 2.4
  • Experienced with RDBMS including MySQL, Oracle, and PostgreSQL
  • Experienced with NoSQL databases including HBase, MongoDB, and Cassandra
  • Experienced in writing MapReduce programs to parse and analyze unstructured data
  • Involved in writing HiveQL queries to process and analyze data
  • Experienced in writing custom UDFs with Scala 2.10+ to extend Hive core functionality
  • Experienced in using Sqoop/Flume to transfer data between RDBMS, NoSQL databases and HDFS
  • Utilized Kafka, RabbitMQ, and Flume to ingest real-time data streams from different data sources into HDFS and HBase
  • Applied other Hadoop ecosystem tools, such as ZooKeeper and Oozie, in production jobs
  • Excellent Object Oriented Programming (OOP) skills with C++ and Java and in-depth understanding of data structures and algorithms.
  • Experienced in graphic and UI design with Adobe Photoshop
  • Experienced in all the phases of Data warehouse life cycle involving requirement analysis, design, coding, testing, and deployment
  • Strong knowledge of Linux/Unix Shell Commands
  • Involved in Tableau Server Configuration and Dashboard building
  • Developed Machine Learning algorithms including Linear Regression, Logistic Regression, K-Means, Decision Trees
  • Good knowledge of Unit Testing with Pytest, ScalaCheck, ScalaTest, JUnit and MRUnit
  • Worked with development tools such as JIRA and Confluence under Agile/Scrum and Waterfall methodologies
  • Self-motivated, enthusiastic learner and dynamic problem solver, successful in fast-paced, multitasking environments both independently and on collaborative teams

TECHNICAL SKILLS:

Hadoop Ecosystem: Hadoop 2.0.0+, MapReduce, Spark 1.3+, Hive 1.0+, Pig 0.12+, Kafka 1.2+, Sqoop 1.3+, Flume 1.3+, Impala 1.2+, Oozie 3.0+, ZooKeeper 3.4+

NoSQL: HBase 0.98, Cassandra 2, MongoDB 3

Programming Languages: Java 7+, Scala 2.10+, C, C++, SQL, SparkSQL, HiveQL, Pig Latin

Operating Systems: Mac OS, Ubuntu, CentOS, Windows

Databases: MySQL 5.x, Oracle 10g, PostgreSQL 9.x, MongoDB 3.2, HBase 0.98

Machine Learning: Linear Regression, Logistic Regression, K-Means, Decision Trees

PROFESSIONAL EXPERIENCE:

Confidential, New York, NY

Big Data Developer/Analyst

Responsibilities:

  • Designed a data pipeline using Flume and Sqoop to ingest customers’ data into HDFS
  • Developed multiple MapReduce jobs in Java for data cleaning
  • Wrote customized UDFs with Scala for data preprocessing.
  • Worked with multiple data formats (XML, CSV, JSON, Avro) and imported data into Hive
  • Wrote customized Hive UDFs (user defined function) for data transformation
  • Built a star-schema data model (fact/dimension tables) using the Kimball approach for data analysis
  • Worked with various compression codecs for Hive file formats, such as gzip, bzip2, LZO, and Snappy
  • Saved aggregation results into tables for fast data retrieval
  • Pushed cleansed data sets into HBase using Sqoop and developed BI reports using Tableau
  • Designed workflows in Oozie to automate data-loading tasks
  • Involved in design and development phases of Software Development Life Cycle using Scrum methodology
  • Performed unit testing using JUnit and MRUnit
  • Used Git for version control and JIRA for project tracking
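
The custom data-preprocessing UDFs mentioned above are project-specific, but a minimal, hypothetical sketch of the kind of field-cleansing logic such a Hive UDF typically wraps might look like the following (the class name and cleansing rules are illustrative assumptions, not the actual code):

```java
// Hypothetical sketch: the kind of field-cleansing logic a custom Hive UDF
// for data preprocessing might wrap. The rules shown here are illustrative.
public class CleanseField {
    // Trim whitespace, collapse internal runs of whitespace, lower-case,
    // and map empty or placeholder values to null.
    public static String cleanse(String raw) {
        if (raw == null) return null;
        String trimmed = raw.trim().replaceAll("\\s+", " ").toLowerCase();
        if (trimmed.isEmpty() || trimmed.equals("n/a") || trimmed.equals("null")) {
            return null;
        }
        return trimmed;
    }
}
```

In a real Hive deployment this logic would be exposed by extending Hive's UDF base class and registering the JAR with `CREATE TEMPORARY FUNCTION`; the pure function above keeps the sketch self-contained.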

Environment: Red Hat Linux, HDFS, MapReduce, Hive, Java, Sqoop, Oozie, CDH, Tableau, HBase, Flume, Eclipse, JIRA, JUnit, MRUnit, Scala

Confidential, Richmond, VA

Big Data Developer

Responsibilities:

  • Extracted data from various source systems (Oracle, MySQL, SQL Server, MongoDB, log files) into an HDFS cluster using Sqoop and Flume
  • Implemented Hive UDFs to incorporate business logic into Hive queries
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing
  • Configured Kafka producers/consumers and a Kafka cluster to serve as temporary data storage
  • Persisted ingested high-throughput data in Cassandra
  • Processed semi-structured data into structured form using Spark Core and Spark SQL
  • Analyzed real-time data using Spark Streaming
  • Worked on Oozie to automate data load jobs into HDFS and Hive
  • Involved in managing and reviewing Hadoop log files
  • Involved in handling issues related to cluster startup and node failures
  • Performed unit testing for Spark and Spark Streaming with Pytest, ScalaCheck
  • Used JIRA for project tracking and Jenkins for continuous integration
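
The semi-structured-to-structured processing above was done per record with Spark Core before registering results as Spark SQL tables; a self-contained sketch of that kind of record parsing, with an assumed log-line layout (timestamp, level, source, message) standing in for the real schema, could look like:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of parsing one semi-structured log line into a
// structured record. The four-field layout is an illustrative assumption.
public class LogParser {
    public static final class Event {
        public final String ts, level, source, message;
        Event(String ts, String level, String source, String message) {
            this.ts = ts; this.level = level;
            this.source = source; this.message = message;
        }
    }

    // e.g. "2016-03-01T12:00:00 WARN payments Timeout contacting gateway"
    private static final Pattern LINE =
        Pattern.compile("(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(.*)");

    // Returns an empty Optional for lines that do not match, so malformed
    // records can simply be filtered out of the pipeline.
    public static Optional<Event> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) return Optional.empty();
        return Optional.of(new Event(m.group(1), m.group(2), m.group(3), m.group(4)));
    }
}
```

In Spark this parse function would be applied inside a `map`/`flatMap` over the raw lines; the sketch omits the Spark API to stay runnable on its own.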

Environment: Hadoop 2.6, Cloudera CDH 5.4, HDFS, MapReduce, Kafka, Oozie, Pig, Hive, Sqoop, JIRA, Jenkins, Cassandra, MongoDB

Confidential, Herndon, VA

Hadoop Developer

Responsibilities:

  • Installed and configured Apache Hadoop clusters and Hadoop tools for application development, including HDFS, YARN, Sqoop, Flume, Hive, Pig, Oozie, ZooKeeper, and HBase
  • Wrote MapReduce jobs in Java to launch and monitor computation on the cluster
  • Migrated required data from RDBMS into HDFS using Sqoop and imported flat files of various formats into HDFS
  • Worked on bulk loads of data from the enterprise data warehouse to Hadoop
  • Wrote Pig Scripts to perform transformation procedures on the data in HDFS
  • Created Oozie workflows to automate the data pipeline and scheduled jobs using the Oozie coordinator
  • Involved in designing Oozie workflows and resource management for YARN
  • Worked with serialization formats such as JSON and XML, and big data serialization formats such as Avro and SequenceFiles
  • Verified importing data into and exporting data out of HDFS and Hive using Sqoop
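
The MapReduce jobs written for this cluster follow the standard map → shuffle → reduce shape; a word-count-style sketch of that shape in plain Java collections (the Hadoop `Mapper`/`Reducer` API and job configuration are deliberately omitted so the example stays self-contained) might be:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the map -> shuffle -> reduce phases of a
// word-count style MapReduce job, using plain Java collections
// rather than the Hadoop API.
public class WordCount {
    // Map phase: emit one count per word found in each input line.
    // Shuffle + reduce: merge() groups by key and sums the counts,
    // which is what the Hadoop framework would do across the cluster.
    public static Map<String, Integer> run(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String word : line.toLowerCase().split("\\W+")) {
                if (!word.isEmpty()) counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```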

Environment: Hadoop, MapReduce, HDFS, Java, Pig, YARN, Sqoop, Oozie, Cassandra, Eclipse, Linux

Confidential

SQL Developer

Responsibilities:

  • Observed database performance and optimized system resources and SQL
  • Supported internal projects by creating update procedures to fix data issues
  • Designed analyses for supply chain management projects involving multiple databases, ETL, and materialized views
  • Set up database monitoring for existing environments using shell scripts
  • Read data from SQL databases and the web through APIs and processed it for further use in Python with the pandas module
  • Wrote SQL queries for JDBC connections in accordance with business logic

Environment: MS SQL Server 2005/2008, Visual Studio 2008, MS Access, MS Excel, Crystal Reports, SQL Server Analysis Services (SSAS)

Confidential, Fort Wayne, IN

Java/J2EE Developer

Responsibilities:

  • Developed unit test code using Java
  • Involved in quality testing and inspection of tests written by other engineers and generated feedback reports
  • Gathered business requirements and wrote technical reports for potential customers
  • Involved in designing and implementing web applications according to customers’ needs
  • Implemented client-side application to invoke SOAP and REST Web Services

Environment: Java 7, ASP.NET, Entity Framework 6, MySQL, PostgreSQL, WCF, WPF, SOAP, REST

Confidential, Indianapolis, IN

Front End Developer

Responsibilities:
  • Involved in the SDLC: requirements gathering, analysis, design, development, and testing of the application
  • Created standards compliant HTML, CSS and JavaScript pages as needed
  • Developed interactive features using JavaScript and jQuery libraries
  • Involved in user interface testing to check website compatibility across multiple browsers
  • Worked with Java back-end, utilizing AJAX to pull in and parse XML

Environment: HTML, JavaScript, JAVA, CSS, AJAX, jQuery, XML
