
Hadoop/Big Data Developer Resume


New York

PROFESSIONAL SUMMARY:

  • 5+ years of IT experience in Apache Hadoop, Java/J2EE, and Node.js in agile team environments
  • 3+ years of hands-on experience with Apache Hadoop and related technologies such as MapReduce, Hive, Pig, HBase, Sqoop, Flume, Oozie, ZooKeeper, Kafka, Spark, Solr, Avro, Impala, and MongoDB
  • Extensive experience in developing MapReduce jobs using Java and thorough understanding of the MapReduce framework
  • Excellent understanding of Hadoop architecture and the components of the Hadoop ecosystem such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, MapReduce, and YARN
  • Experience in implementing HBase, MongoDB, Solr, Elasticsearch, and Apache ManifoldCF
  • Optimization and performance tuning of MapReduce, Pig, and Hive queries
  • Experience in importing and exporting data between HDFS and relational database systems using the Apache Hadoop ecosystem
  • Knowledge of using the Talend ETL tool to import and export data
  • Hands-on experience in real-time ingestion of data into HDFS using Flume
  • Defined UDFs in Pig and Hive to capture customer behaviour (a minimal Hive UDF sketch in Java follows this summary)
  • Created Hive external tables on MapReduce output, then applied partitioning and bucketing on top of them
  • Experience in deployment and configuration of Hadoop clusters
  • Knowledge of Apache Oozie for scheduling and managing Hadoop jobs
  • Knowledge of HCatalog for Hadoop-based storage management
  • Hands-on experience in Python for processing jobs with Apache Spark
  • Knowledge of algorithms and data structures used in machine learning
  • Proficient in implementing Spark with Scala
  • Knowledge of implementing Apache Storm
  • Developed and maintained several batch jobs that run automatically per business requirements
  • Expertise in Java Collections and multithreading
  • Experience in creating RESTful web services
  • Experience in writing server-side JavaScript code
  • Experience in designing and interpreting UML diagrams
  • Knowledge of testing application code against business requirements
  • Knowledge of configuring the Elasticsearch engine
  • Knowledge of implementing PL/SQL procedures
  • Good knowledge of Unix shell scripting for loading data into Hadoop clusters
  • Experience in code, build, and deployment in a Continuous Integration (CI) server environment using Jenkins
  • Good experience in developing applications using open-source technologies
  • Good interpersonal skills and a motivated team player
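
As a concrete illustration of the Hive UDF work mentioned in this summary, below is a minimal sketch using the classic org.apache.hadoop.hive.ql.exec.UDF interface; the class name, the page-view input column, and the segment labels are hypothetical, not details from the original projects.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF: buckets a raw page-view count into a coarse
// customer-behaviour segment. Registered in Hive with, for example:
//   ADD JAR behaviour-udf.jar;
//   CREATE TEMPORARY FUNCTION behaviour_segment AS 'BehaviourSegmentUDF';
public class BehaviourSegmentUDF extends UDF {
    public Text evaluate(Integer pageViews) {
        if (pageViews == null) {
            return null;                        // pass NULL through unchanged
        }
        if (pageViews > 50) {
            return new Text("heavy");
        }
        return new Text(pageViews > 10 ? "regular" : "casual");
    }
}
```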

TECHNICAL SKILLS:

Hadoop Ecosystem: MapReduce, Pig, Hive, Oozie, Sqoop, HCatalog, ZooKeeper, HBase, Flume, Kafka, Spark, Impala, Storm

Languages: Java, Node.js, SQL, PL/SQL, Python, Scala

Core Java: Data Structures, Collections, Multithreading

J2EE Technologies: JSP, Java Servlets, JDBC, Hibernate

Cloud Technologies: Amazon EC2, S3, AWS CloudSearch, EMR, DynamoDB

Databases (NoSQL/RDBMS): Oracle, DB2, Cassandra, MongoDB, Microsoft SQL Server, Redis

Search Engines/Tools: Solr, Elasticsearch, Apache ManifoldCF

Web Technologies: JavaScript, jQuery, AJAX, AngularJS, XML, JSON

IDEs: Eclipse, NetBeans

Operating Systems: Windows Vista/7/8, UNIX, Linux

Build Tools: Jenkins, Maven, Docker, Gradle, Ant

Version Control: Git, SVN

Testing/Logging: JUnit, Log4j, Karma

PROFESSIONAL EXPERIENCE:

Confidential, New York

Hadoop/Big Data Developer

Responsibilities:

  • Designed and implemented MapReduce jobs to support distributed processing using Java, HDFS, Hive, and Pig
  • Developed and implemented MapReduce jobs on YARN and the Hadoop cluster
  • Installed and configured Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster
  • Performed detailed analysis of the current system to identify the different sources of data; involved in cluster setup
  • Implemented custom data types, InputFormats, RecordReaders, OutputFormats, and RecordWriters for MapReduce computations
  • Developed various MapReduce jobs in Java involving multi-dataset joins, composite keys, and secondary sorting (see the composite-key sketch after this list)
  • Performed batch processing of logs from various data sources using MapReduce
  • In-depth understanding of data structures and algorithms; hands-on experience writing MapReduce jobs in Java and Pig
  • Ingested raw customer data feeds
  • Exposure to Amazon Web Services (AWS) cloud computing: EMR, EC2, and S3
  • Used Pig and Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data; hands-on experience with Pig and Hive user-defined functions (UDFs)
  • Used Apache Sqoop to transfer data from relational databases into HDFS
  • Loaded transformed data from HDFS into HBase for further analytics
  • Wrote Pig scripts for data cleaning and pre-processing
  • Monitored cluster coordination services through ZooKeeper
  • Expertise with NoSQL databases such as HBase, Cassandra, DynamoDB (AWS), and MongoDB
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs
  • Performed data transformations in Hive and used partitions and buckets for performance improvements
  • Worked as an integral part of the development team, troubleshooting development and maintenance of systems
  • Worked with the development team to create appropriate cloud solutions for client needs
  • Knowledge of writing Hive and Pig queries and running both types of scripts
  • Joined two or more datasets to filter out unwanted data
  • Worked with Hive complex data types and bucketing
  • Imported and exported data between MySQL and HDFS
  • Used Cloudera Manager to monitor the jobs running on the cluster
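
A minimal sketch of the kind of composite key used for the multi-dataset joins and secondary sorting mentioned above, written against the standard Hadoop Writable API; the field names (customerId, timestamp) are hypothetical.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.WritableComparable;

// Hypothetical composite key: groups records by customerId while ordering each
// group by timestamp, which enables secondary sort in the reducer.
public class CustomerEventKey implements WritableComparable<CustomerEventKey> {
    private String customerId;
    private long timestamp;

    public CustomerEventKey() { }                         // required by Hadoop reflection

    public CustomerEventKey(String customerId, long timestamp) {
        this.customerId = customerId;
        this.timestamp = timestamp;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(customerId);
        out.writeLong(timestamp);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        customerId = in.readUTF();
        timestamp = in.readLong();
    }

    @Override
    public int compareTo(CustomerEventKey other) {
        int cmp = customerId.compareTo(other.customerId); // natural key first
        return cmp != 0 ? cmp : Long.compare(timestamp, other.timestamp);
    }

    @Override
    public int hashCode() {
        return customerId.hashCode();   // partition by customerId only
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof CustomerEventKey)) return false;
        CustomerEventKey k = (CustomerEventKey) o;
        return customerId.equals(k.customerId) && timestamp == k.timestamp;
    }

    public String getCustomerId() { return customerId; }
    public long getTimestamp()    { return timestamp; }
}
```

In practice such a key is paired with a grouping comparator keyed on customerId alone, so all events for one customer arrive at a single reducer call already sorted by timestamp.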

Environment: Apache Hadoop, Hive, HDFS, Java MapReduce, EMR, EC2, S3, DynamoDB, Core Java, Git, UNIX, MySQL, Eclipse, Oozie, Sqoop, Flume, Hue, Cloudera Manager

Confidential, Virginia

Hadoop Developer

Responsibilities:

  • Developed, maintained, updated, and scheduled periodic MapReduce jobs for users
  • Transformed structured and unstructured data stored in HDFS
  • Used Amazon EC2 to run the MapReduce jobs in the cloud
  • Ingested data into Hadoop using Sqoop and applied data transformations using Pig and Hive
  • Loaded daily traffic data from online transactions into HDFS using Apache Flume
  • Used Pig for data pre-processing
  • Performed periodic dumps of related datasets into HDFS using Sqoop to run clustering MapReduce jobs
  • Used combiners in MapReduce jobs to reduce the load on the reducers (see the driver sketch after this list)
  • Processed two large datasets using MapReduce
  • Implemented secondary sort on the values passed to the reducer
  • Sent records with the same key and value data through the partitioner to the same reducer to avoid processing them twice
  • Analysed data using Pig and wrote Pig scripts for grouping, joining, and sorting the data
  • Knowledge of writing Hive and Pig queries and running both scripts in Tez mode to improve performance on the Hortonworks Data Platform
  • Used Avro for data serialization
  • Strong experience with UNIX shell scripting
  • Responsible for creating Hive tables, loading the structured data produced by MapReduce jobs into the tables, and writing Hive queries to further analyse the transactions and identify issues and behavioural patterns
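
A word-count-style driver sketch illustrating the combiner usage described above: the sum reducer is associative and commutative, so it can be reused as a combiner for map-side pre-aggregation. The class names, the tab-separated log layout, and the input/output paths are hypothetical.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical job: counts hits per page from tab-separated traffic logs.
public class TrafficCount {

    public static class TrafficMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text page = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t");
            if (fields.length > 1) {             // assumed layout: 2nd column is the page
                page.set(fields[1]);
                context.write(page, ONE);
            }
        }
    }

    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "traffic count");
        job.setJarByClass(TrafficCount.class);
        job.setMapperClass(TrafficMapper.class);
        job.setCombinerClass(SumReducer.class);   // map-side pre-aggregation cuts the shuffle
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

It would be run with something like `hadoop jar traffic-count.jar TrafficCount /input/logs /output/counts` (paths illustrative).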

Environment: Hadoop, Java, MapReduce, Apache Pig, Flume, Sqoop, Hive, HDFS, Hortonworks, Eclipse, EC2

Confidential

Full Stack Developer

Responsibilities:

  • Involved in designing front-end applications using web technologies such as HTML5, JSON, CSS3, and AngularJS
  • Built more interactive web pages using jQuery plugins, AJAX, and JavaScript
  • Resolved cross-browser compatibility issues
  • Created various controllers and views using the AngularJS MVC framework
  • Interacted with Java controllers using jQuery, AJAX, and JSON to read and write data from back-end systems
  • Implemented a Single Page Application using ngRoute and $routeProvider to make the application more lightweight
  • Created reusable components using AngularJS custom directives, custom filters, and factories
  • Implemented AngularJS services and factories that use REST web services to fetch data from the backend
  • Experience in using JavaScript for client-side validations
  • Used Log4j to generate log information
  • Developed a middleware server using Node.js, Java, and MATLAB runtime environments
  • Used Apache Kafka as a high-speed data gateway to capture EMG input data at 200 Hz and IMU data streamed at 50 Hz (see the producer sketch after this list)
  • Used the NoSQL database MongoDB to store application data
  • Hosted the middleware server, website, audio, video, images, and other static files on the Amazon AWS cloud
  • Developed RESTful APIs on the Node.js server and integrated a third-party Java API using the edge Node module
  • Used Highcharts for reporting solutions with PDF, PNG, and other export formats
  • Provided system security with task-based permissions at the user and role level
  • Loaded large volumes of real data into the distributed file system using Flume from the Hadoop ecosystem
  • Processed the data and stored it in HBase for generating task-based reports
  • Performed unit tests for the application using Jasmine
  • Integrated and configured the application with its databases using MongoDB and Sqoop
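
A minimal Java producer sketch for the Kafka-based sensor gateway described above; the broker address, topic name, device id, and CSV payload layout are illustrative assumptions, not details from the original project.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Hypothetical producer: publishes one EMG sample per message, keyed by device
// id so a device's readings stay ordered within a partition.
public class EmgSampleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");               // illustrative broker
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            String deviceId = "myo-01";                                  // illustrative device id
            String sample = System.currentTimeMillis() + ",0.42,0.17,-0.08";
            producer.send(new ProducerRecord<>("emg-samples", deviceId, sample));
        }
    }
}
```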

Environment: Amazon AWS EC2, S3, Kafka, Flume, Sqoop, HBase, Node.js, MongoDB, JavaScript, Java, MATLAB, WebSockets, REST API, Apache Cordova, Ionic framework, Mongoose, edge, Jasmine, Karma, AngularJS, Myo armband, Myo.js

Confidential

JAVA DEVELOPER

Responsibilities:

  • Developed the entire SCADA application from scratch using Java technologies due to the complex nature of the requirements
  • Implemented an SOA architecture for communication between individual PLCs and between the PC and PLCs using XML-RPC, exposing each device as a service
  • Implemented industry best practices to improve the quality of product development
  • Provided a client/server solution using XML-RPC communication for scalability of the system (see the client sketch after this list)
  • Provided reporting solutions with PDF export
  • Provided a complete multithreaded communication platform for visualization, recipe download, configuration, setup, utilities, and maintenance
  • Provided system security with task-based permissions assigned to users and roles
  • Implemented the Alarms, Messages, and Error Report web services
  • Experience in developing applications using the Eclipse IDE and running test cases
  • Analyzed, created, and proposed remediation measures to fix bugs in the application
  • Developed action classes and configuration files for the Struts framework
  • Developed Oracle stored procedures and complex SQL
  • Created and maintained PL/SQL procedures that run in the background, freeing application threads from database deadlocks
  • Tested the application thoroughly in the unit-testing phase using JUnit
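
A minimal sketch of an XML-RPC call of the kind described above, assuming the Apache XML-RPC client library (org.apache.xmlrpc); the endpoint URL, handler name, and parameters are hypothetical.

```java
import java.net.URL;
import org.apache.xmlrpc.client.XmlRpcClient;
import org.apache.xmlrpc.client.XmlRpcClientConfigImpl;

// Hypothetical call from the PC-side SCADA application to a PLC exposed as a
// service over XML-RPC; URL, method name, and arguments are illustrative.
public class PlcStatusClient {
    public static void main(String[] args) throws Exception {
        XmlRpcClientConfigImpl config = new XmlRpcClientConfigImpl();
        config.setServerURL(new URL("http://192.168.0.10:8080/xmlrpc"));

        XmlRpcClient client = new XmlRpcClient();
        client.setConfig(config);

        // Each PLC registers handlers such as "plc.getStatus"; the return type
        // depends on the handler (assumed here to be a status string).
        Object status = client.execute("plc.getStatus", new Object[] { "line-1" });
        System.out.println("PLC status: " + status);
    }
}
```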

Environment: Java, XML-RPC, Web services, Struts, JSP, Oracle, JUnit, JDBC, UML, Eclipse, PL/SQL, Core Java
