
Big Data Developer/analyst Resume


New York, NY

SUMMARY:

  • Over 6 years of professional IT experience with Big Data ecosystem technologies and Java/J2EE technologies in industries including financial services and telecommunications
  • Experienced in the Big Data ecosystem with Hadoop 2.0, HDFS, MapReduce, Pig 0.12+, Hive 1.0+, HBase 0.98+, Sqoop 1.3+, Flume 1.3+, Kafka 1.2+, Oozie 3.0+, and Spark 1.3+
  • Experienced with distributions including Cloudera CDH 5.8 and Hortonworks HDP 2.4
  • Experienced with RDBMS including MySQL, Oracle, and PostgreSQL
  • Experienced with NoSQL databases including HBase, MongoDB, and Cassandra
  • Experienced in writing MapReduce programs to parse and analyze unstructured data
  • Involved in writing HiveQL queries to process and analyze data
  • Experienced in writing custom UDFs with Scala 2.10+ to extend Hive core functionality
  • Experienced in using Sqoop/Flume to transfer data between RDBMS, NoSQL databases and HDFS
  • Utilized Kafka, RabbitMQ, and Flume to ingest real-time data streams from different data sources into HDFS and HBase
  • Applied other Hadoop ecosystem tools, such as ZooKeeper and Oozie, in production jobs
  • Excellent Object Oriented Programming (OOP) skills with C++ and Java and in-depth understanding of data structures and algorithms.
  • Experienced in graphic and UI design with Adobe Photoshop
  • Experienced in all the phases of Data warehouse life cycle involving requirement analysis, design, coding, testing, and deployment
  • Strong knowledge of Linux/Unix Shell Commands
  • Involved in Tableau Server Configuration and Dashboard building
  • Developed Machine Learning algorithms including Linear Regression, Logistic Regression, K-Means, Decision Trees
  • Good knowledge of Unit Testing with Pytest, ScalaCheck, ScalaTest, JUnit and MRUnit
  • Worked with development tools such as JIRA and Confluence under Agile/Scrum and Waterfall methodologies
  • Self-motivated, enthusiastic learner and dynamic problem solver, successful in fast-paced, multitasking environments both independently and on collaborative teams

TECHNICAL SKILLS:

Hadoop Ecosystem: Hadoop 2.0.0+, MapReduce, Spark 1.3+, Hive 1.0+, Pig 0.12+, Kafka 1.2+, Sqoop 1.3+, Flume 1.3+, Impala 1.2+, Oozie 3.0+, ZooKeeper 3.4+

NoSQL: HBase 0.98, Cassandra 2, MongoDB 3

Programming Languages: Java 7+, Scala 2.10+, C, C++, SQL, SparkSQL, HiveQL, Pig Latin

Operating Systems: Mac OS, Ubuntu, CentOS, Windows

Databases: MySQL 5.x, Oracle 10g, PostgreSQL 9.x, MongoDB 3.2, HBase 0.98

Machine Learning: Linear Regression, Logistic Regression, K-Means, Decision Trees

PROFESSIONAL EXPERIENCE:

Confidential, New York, NY

Big Data Developer/Analyst

Responsibilities:

  • Designed a data pipeline using Flume and Sqoop to ingest customers’ data into HDFS
  • Developed multiple MapReduce jobs in Java for data cleaning
  • Wrote customized UDFs with Scala for data preprocessing.
  • Worked with multiple data formats (XML, CSV, JSON, Avro) and imported data into Hive
  • Wrote customized Hive UDFs (user defined function) for data transformation
  • Built a star-schema data model (fact/dimension tables) using the Kimball approach for data analysis
  • Worked with various compression codecs for Hive file formats, such as gzip, bzip2, LZO, and Snappy
  • Saved aggregation results into tables for fast data retrieval
  • Pushed cleansed data sets into HBase using Sqoop and developed BI reports using Tableau
  • Designed workflows in Oozie to automate data-loading tasks
  • Involved in design and development phases of Software Development Life Cycle using Scrum methodology
  • Performed unit testing using JUnit and MRUnit
  • Used Git for version control and JIRA for project tracking
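
The custom data-preprocessing UDFs mentioned above are project-specific, but a minimal, hypothetical sketch of the kind of field-cleansing logic such a Hive UDF typically wraps might look like the following (the class name and cleansing rules are illustrative assumptions, not the actual code):

```java
// Hypothetical sketch: the kind of field-cleansing logic a custom Hive UDF
// for data preprocessing might wrap. The rules shown here are illustrative.
public class CleanseField {
    // Trim whitespace, collapse internal runs of whitespace, lower-case,
    // and map empty or placeholder values to null.
    public static String cleanse(String raw) {
        if (raw == null) return null;
        String trimmed = raw.trim().replaceAll("\\s+", " ").toLowerCase();
        if (trimmed.isEmpty() || trimmed.equals("n/a") || trimmed.equals("null")) {
            return null;
        }
        return trimmed;
    }
}
```

In a real Hive deployment this logic would be exposed by extending Hive's UDF base class and registering the JAR with `CREATE TEMPORARY FUNCTION`; the pure function above keeps the sketch self-contained.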

Environment: Red Hat Linux, HDFS, MapReduce, Hive, Java, Sqoop, Oozie, CDH, Tableau, HBase, Flume, Eclipse, JIRA, JUnit, MRUnit, Scala

Confidential, Richmond, VA

Big Data Developer

Responsibilities:

  • Extracted data from various source systems (Oracle, MySQL, SQL Server, MongoDB, log files) into an HDFS cluster using Sqoop and Flume
  • Implemented Hive UDFs to incorporate business logic into Hive queries
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing
  • Configured Kafka producers/consumers and a Kafka cluster to serve as temporary data storage
  • Persisted ingested high-throughput data in Cassandra
  • Processed semi-structured data into structured form using Spark Core and Spark SQL
  • Analyzed real-time data using Spark Streaming
  • Worked on Oozie to automate data load jobs into HDFS and Hive
  • Involved in managing and reviewing Hadoop log files
  • Involved in handling issues related to cluster startup and node failures
  • Performed unit testing for Spark and Spark Streaming with Pytest, ScalaCheck
  • Used JIRA for project tracking and Jenkins for continuous integration
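
The semi-structured-to-structured processing above was done per record with Spark Core before registering results as Spark SQL tables; a self-contained sketch of that kind of record parsing, with an assumed log-line layout (timestamp, level, source, message) standing in for the real schema, could look like:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of parsing one semi-structured log line into a
// structured record. The four-field layout is an illustrative assumption.
public class LogParser {
    public static final class Event {
        public final String ts, level, source, message;
        Event(String ts, String level, String source, String message) {
            this.ts = ts; this.level = level;
            this.source = source; this.message = message;
        }
    }

    // e.g. "2016-03-01T12:00:00 WARN payments Timeout contacting gateway"
    private static final Pattern LINE =
        Pattern.compile("(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(.*)");

    // Returns an empty Optional for lines that do not match, so malformed
    // records can simply be filtered out of the pipeline.
    public static Optional<Event> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) return Optional.empty();
        return Optional.of(new Event(m.group(1), m.group(2), m.group(3), m.group(4)));
    }
}
```

In Spark this parse function would be applied inside a `map`/`flatMap` over the raw lines; the sketch omits the Spark API to stay runnable on its own.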

Environment: Hadoop 2.6, Cloudera CDH 5.4, HDFS, MapReduce, Kafka, Oozie, Pig, Hive, Sqoop, JIRA, Jenkins, Cassandra, MongoDB

Confidential, Herndon, VA

Hadoop Developer

Responsibilities:

  • Installed and configured Apache Hadoop clusters and Hadoop tools for application development, including HDFS, YARN, Sqoop, Flume, Hive, Pig, Oozie, ZooKeeper, and HBase
  • Wrote MapReduce jobs in Java to launch and monitor computation on the cluster
  • Migrated required data from RDBMS into HDFS using Sqoop and imported flat files of various formats into HDFS
  • Worked on bulk loads of data from the enterprise data warehouse to Hadoop
  • Wrote Pig Scripts to perform transformation procedures on the data in HDFS
  • Created Oozie workflows to automate the data pipeline and scheduled jobs using the Oozie coordinator
  • Involved in designing Oozie workflows and resource management for YARN
  • Worked with serialization formats such as JSON and XML, and big data serialization formats such as Avro and SequenceFiles
  • Verified importing data into and exporting data out of HDFS and Hive using Sqoop
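
The MapReduce jobs written for this cluster follow the standard map → shuffle → reduce shape; a word-count-style sketch of that shape in plain Java collections (the Hadoop `Mapper`/`Reducer` API and job configuration are deliberately omitted so the example stays self-contained) might be:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the map -> shuffle -> reduce phases of a
// word-count style MapReduce job, using plain Java collections
// rather than the Hadoop API.
public class WordCount {
    // Map phase: emit one count per word found in each input line.
    // Shuffle + reduce: merge() groups by key and sums the counts,
    // which is what the Hadoop framework would do across the cluster.
    public static Map<String, Integer> run(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String word : line.toLowerCase().split("\\W+")) {
                if (!word.isEmpty()) counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```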

Environment: Hadoop, MapReduce, HDFS, Java, Pig, YARN, Sqoop, Oozie, Cassandra, Eclipse, Linux

Confidential

SQL Developer

Responsibilities:

  • Observed database performance and optimized system resources and SQL
  • Supported internal projects by creating update procedures to fix data issues
  • Designed analyses for supply chain management projects involving multiple databases, ETL, and materialized views
  • Set up database monitoring for existing environments using shell scripts
  • Read data from SQL databases and the web through APIs and processed it for further use in Python with the pandas module
  • Wrote SQL queries for JDBC connections in accordance with business logic

Environment: MS SQL Server 2005/2008, Visual Studio 2008, MS Access, MS Excel, Crystal Reports, SQL Server Analysis Services (SSAS)

Confidential, Fort Wayne, IN

Java/J2EE Developer

Responsibilities:

  • Developed unit test code using Java
  • Involved in quality testing and inspection of tests written by other engineers and generated feedback reports
  • Gathered business requirements and wrote technical reports for potential customers
  • Involved in designing and implementing web applications according to customers’ needs
  • Implemented client-side application to invoke SOAP and REST Web Services

Environment: Java 7, ASP.NET, Entity Framework 6, MySQL, PostgreSQL, WCF, WPF, SOAP, REST

Confidential, Indianapolis, IN

Front End Developer

Responsibilities:
  • Involved in the SDLC: requirements gathering, analysis, design, development, and testing of the application
  • Created standards compliant HTML, CSS and JavaScript pages as needed
  • Developed interactive features using JavaScript and jQuery libraries
  • Involved in user interface testing to check website compatibility across multiple browsers
  • Worked with Java back-end, utilizing AJAX to pull in and parse XML

Environment: HTML, JavaScript, JAVA, CSS, AJAX, jQuery, XML
