Hadoop/Big Data Developer Resume
New York
PROFESSIONAL SUMMARY:
- Over 5 years of IT experience with Apache Hadoop, Java/J2EE, and Node.js in agile team environments
- 3+ years of hands-on experience with Apache Hadoop and related technologies such as MapReduce, Hive, Pig, HBase, Sqoop, Flume, Oozie, ZooKeeper, Kafka, Spark, Solr, Avro, Impala, and MongoDB
- Extensive experience in developing MapReduce jobs using Java and a thorough understanding of the MapReduce framework
- Excellent understanding of Hadoop architecture and the components of the Hadoop ecosystem, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, MapReduce, and YARN
- Experience in implementing HBase, MongoDB, Solr, Elasticsearch, and Apache ManifoldCF
- Optimization and performance tuning of MapReduce, Pig, and Hive queries
- Experience in importing and exporting data between HDFS and relational database systems using the Apache Hadoop ecosystem
- Knowledge of Talend and other ETL tools for importing/exporting data
- Hands-on experience in real-time ingestion of data into HDFS using Flume
- Defined Pig and Hive UDFs to capture customer behaviour (a minimal Java UDF sketch follows this summary)
- Created Hive external tables on MapReduce output, then applied partitioning and bucketing on top of them
- Experience in deployment and configuration of Hadoop Clusters
- Knowledge of Apache Oozie for scheduling and managing Hadoop jobs
- Knowledge of HCatalog for Hadoop-based storage management
- Hands-on Python experience for processing jobs with Apache Spark
- Knowledge of algorithms and data structures used in machine learning
- Proficient in implementing Spark with Scala
- Knowledge of Apache Storm implementation
- Developed and maintained several batch jobs that run automatically based on business requirements
- Expertise in Java Collections and multithreading
- Experience in creating RESTful web services
- Experience in writing server-side JavaScript code
- Experience in designing and interpreting UML Diagrams
- Knowledge of testing application code against business requirements
- Knowledge of configuring the Elasticsearch engine
- Knowledge of implementing PL/SQL procedures
- Good knowledge of UNIX shell scripting for loading data into a Hadoop cluster
- Experience with code, build, and deployment in a CI (Continuous Integration) environment using Jenkins
- Good experience in developing applications using open source technologies
- Good interpersonal skills and a motivated team player
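Below is a minimal, illustrative Hive UDF in Java of the kind referenced in the customer-behaviour bullet above. It is a sketch only: the class name, column semantics, and thresholds are hypothetical, and it assumes the classic org.apache.hadoop.hive.ql.exec.UDF API.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF: buckets a raw page-view count into a coarse
// customer-behaviour segment so it can be called from HiveQL, e.g.
//   SELECT behaviour_segment(page_views) FROM clickstream;
public class BehaviourSegmentUDF extends UDF {
    public Text evaluate(Long pageViews) {
        if (pageViews == null) {
            return null;                 // preserve SQL NULL semantics
        }
        if (pageViews > 100) {
            return new Text("heavy");
        } else if (pageViews > 10) {
            return new Text("regular");
        }
        return new Text("casual");
    }
}
```

After packaging the class into a jar, it would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before use.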
TECHNICAL SKILLS:
Hadoop Ecosystem: MapReduce, Pig, Hive, Oozie, Sqoop, HCatalog, ZooKeeper, HBase, Flume, Kafka, Spark, Impala, Storm
Languages: Java, Node.js, SQL, PL/SQL, Python, Scala
Core Java: Data Structures, Collections, Multithreading
J2EE Technologies: JSP, JavaServlets, JDBC, Hibernate
Cloud technologies: Amazon EC2, S3, AWS CloudSearch, EMR, DynamoDB
Database (NOSQL/RDBMS): Oracle, DB2, Cassandra, MongoDB, Microsoft SQL Server, Redis
Search Engine/Tools: Solr, Elasticsearch, Apache ManifoldCF
Web technologies: JavaScript, jQuery, Ajax, AngularJS, XML, JSON
IDEs: Eclipse, NetBeans
Operating Systems: Windows Vista/7/8, UNIX, Linux
Build Tools: Jenkins, Maven, Docker, Gradle, Ant
Version Control: GIT, SVN
Testing/Logging: JUnit, Log4j, Karma
PROFESSIONAL EXPERIENCE:
Confidential, New York
Hadoop/Big Data Developer
Responsibilities:
- Designed and implemented MapReduce jobs to support distributed processing using Java, HDFS, Hive, and Pig
- Developed and implemented MapReduce jobs on YARN and Hadoop Cluster
- Installed and configured Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster
- Performed detailed analysis of the current system to identify the different data sources; involved in cluster setup
- Implemented custom data types, InputFormat, RecordReader, OutputFormat, and RecordWriter for MapReduce computations
- Developed various MapReduce jobs in Java involving multi-dataset joins, composite keys, and secondary sorting (see the composite-key sketch after this role)
- Performed Batch processing of logs from various data sources using MapReduce
- In-depth understanding of Data Structures and Algorithms. Have hands on experience in writing MapReduce jobs in Java and Pig
- Ingested raw customer data feeds
- Exposure to Amazon Web Services - AWS cloud computing (EMR, EC2 and S3 services)
- Used Pig and Hive jobs to parse logs and structure them in tabular format to facilitate effective querying of the log data; hands-on experience with Pig and Hive user-defined functions (UDFs)
- Used Apache Sqoop to transfer data from Relational databases into HDFS
- Loaded the transformed data from HDFS into HBase for further use in analytics (see the HBase sketch after this role)
- Wrote Pig Scripts for data cleaning and pre-processing the data
- Monitored cluster coordination services through ZooKeeper
- Expertise with NoSQL databases such as HBase, Cassandra, DynamoDB (AWS), and MongoDB
- Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs
- Performed Data transformations in HIVE and used partitions, buckets for performance improvements.
- Worked as an integral part of the development team on troubleshooting, development, and maintenance of systems
- Worked with the development team to create appropriate cloud solutions for client needs.
- Wrote Hive and Pig queries and ran both types of scripts
- Joined two or more datasets to filter out unwanted data
- Worked with Hive complex data types and bucketing
- Imported and exported data between MySQL and HDFS
- Used Cloudera Manager to monitor the jobs running on the cluster
Environment: Apache Hadoop, Hive, HDFS, Java MapReduce, EMR, EC2, S3, DynamoDB, Core Java, GIT, UNIX, MySQL, Eclipse, Oozie, Sqoop, Flume, Hue, Cloudera Manager
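A minimal sketch of the composite-key pattern used for secondary sorting in this role. All field and class names are hypothetical; a complete job would also register a matching partitioner and grouping comparator (job.setPartitionerClass, job.setGroupingComparatorClass) so records are grouped by the natural key only.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;

// Hypothetical composite key: records are partitioned and grouped by
// customerId but arrive at the reducer already sorted by eventTime.
public class CustomerEventKey implements WritableComparable<CustomerEventKey> {
    private final Text customerId = new Text();
    private long eventTime;

    public void set(String id, long time) {
        customerId.set(id);
        eventTime = time;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        customerId.write(out);
        out.writeLong(eventTime);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        customerId.readFields(in);
        eventTime = in.readLong();
    }

    @Override
    public int compareTo(CustomerEventKey other) {
        int cmp = customerId.compareTo(other.customerId);  // natural key first
        if (cmp != 0) {
            return cmp;
        }
        return Long.compare(eventTime, other.eventTime);   // then the secondary field
    }
}
```

A second minimal sketch, for the HBase-loading bullet above, showing how transformed records could be written with the standard HBase client API. The table, column family, qualifier, and row key are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseLoader {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();   // reads hbase-site.xml from the classpath
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("customer_metrics"))) {
            // One Put per transformed record: row key = customer id.
            Put put = new Put(Bytes.toBytes("cust-0001"));
            put.addColumn(Bytes.toBytes("stats"),        // column family
                          Bytes.toBytes("page_views"),   // qualifier
                          Bytes.toBytes(42L));           // value
            table.put(put);
        }
    }
}
```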
Confidential, Virginia
Hadoop Developer
Responsibilities:
- Developed, maintained, updated, and scheduled periodic MapReduce jobs for users
- Transformed structured and unstructured data stored in HDFS
- Used Amazon EC2 to perform the MapReduce Jobs on the Cloud
- Ingested data into Hadoop using Sqoop and applied data transformations using Pig and Hive
- Loaded daily traffic data from online transactions into HDFS using Apache Flume
- Used Pig for data pre-processing
- Performed periodic dumps of data and related datasets into HDFS using Sqoop to run clustering MapReduce jobs
- Used combiners in MapReduce jobs to reduce the load on the reducers (see the combiner sketch after this role)
- Processed two large datasets using MapReduce
- Implemented secondary sort on the values passed to the reducer
- Sent records with the same key through the partitioner to the same reducer to avoid processing them twice
- Analysed the data using Pig and wrote Pig scripts for grouping, joining, and sorting the data
- Wrote Hive and Pig queries and ran both scripts in Tez mode to improve performance on the Hortonworks Data Platform
- Used AVRO for data serialization.
- Good experience with UNIX shell scripting
- Responsible for creating Hive tables, loading the structured data produced by MapReduce jobs into those tables, and writing Hive queries to further analyse the transactions to identify issues and behavioural patterns
Environment: Hadoop, Java, MapReduce, Apache Pig, Flume, Sqoop, Hive, HDFS, Hortonworks, Eclipse, EC2
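A minimal sketch of the combiner usage described in this role. Class, job, and path names are hypothetical; the point is that a sum is associative and commutative, so the same reducer class can be registered as the combiner to pre-aggregate map output and reduce shuffle volume.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TrafficCountDriver {

    // Sums hit counts per key; safe to reuse as a combiner because the
    // aggregation is associative and commutative.
    public static class SumReducer
            extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                throws IOException, InterruptedException {
            long total = 0;
            for (LongWritable value : values) {
                total += value.get();
            }
            context.write(key, new LongWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "traffic-count");
        job.setJarByClass(TrafficCountDriver.class);
        // job.setMapperClass(...) would point at a hypothetical log-parsing mapper
        // that emits (Text key, LongWritable 1) pairs.
        job.setCombinerClass(SumReducer.class);   // pre-aggregates map output locally
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```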
Confidential
Full Stack Developer
Responsibilities:
- Involved in designing the front end applications using web technologies like HTML 5, JSON, CSS3, AngularJS.
- Built more interactive web pages using jQuery plugins, AJAX, and JavaScript.
- Resolved cross-browser compatibility issues.
- Created various controllers and views using the AngularJS MVC framework.
- Interacted with Java controllers (jQuery, AJAX, and JSON) to read/write data from back-end systems.
- Implemented a single-page application using ngRoute and $routeProvider to keep the application lightweight.
- Created reusable Components using AngularJS custom directives, custom filters and factories.
- Implemented AngularJS Service and Factory using REST Web services to fetch data from backend.
- Experience in using JavaScript for client-side validations.
- Used Log4J to generate the log information.
- Developed a middleware server using Node.js, Java, and MATLAB runtime environments
- Used Apache Kafka as a high-speed data gateway to capture input EMG data at 200 Hz and IMU data streamed at 50 Hz (see the producer sketch after this role)
- Used the NoSQL database MongoDB to store the application data
- Hosted the middleware server, website, audio, video, images, and other static files on the Amazon AWS cloud
- Developed RESTful APIs on the Node.js server and integrated third-party Java APIs using the edge Node module
- Used Highcharts for reporting solutions with PDF, PNG, and other export formats
- Provided system security with task-based permissions at the user and role level
- Loaded large volumes of real data into the distributed file system using Flume from the Hadoop ecosystem
- Processed the data and stored it in HBase for generating reports based on the task
- Performed Unit Tests for the application using Jasmine
- Integrated and configured the application with the database using MongoDB and Sqoop
Environment: Amazon AWS EC2, S3, Kafka, Flume, Sqoop, HBase, Node.js, MongoDB, JavaScript, Java, MATLAB, WebSockets, REST API, Apache Cordova, Ionic framework, Mongoose, edge, Jasmine, Karma, AngularJS, Myo armband, MyoJS
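A minimal sketch of the Kafka ingestion path described above, written with the standard Kafka Java producer client. The broker address, topic name, and payload format are hypothetical, and since the actual gateway ran in Node.js this is only an illustration of the equivalent producer call, not the original code.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class EmgSampleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // hypothetical broker
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // One message per EMG sample; at 200 Hz this loop would be driven by
            // the sensor callback rather than a plain for-loop.
            for (int i = 0; i < 200; i++) {
                String payload = "{\"sensor\":\"emg\",\"sample\":" + i + "}";
                producer.send(new ProducerRecord<>("emg-samples", "armband-1", payload));
            }
        }
    }
}
```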
Confidential
JAVA DEVELOPER
Responsibilities:
- Developed the entire SCADA application from scratch using Java technologies due to the complex nature of the requirements.
- Implemented an SOA architecture for communication between individual PLCs and between PC and PLC using XML-RPC, exposing each device as a service (see the client sketch after this role).
- Implemented industry best practices to improve the quality of the product development.
- Provided a client/server solution using XML-RPC communication for scalability of the system.
- Provided reporting solutions with PDF export format.
- Provided a complete multithreaded communication platform for visualization, recipe download, configuration, setup, utilities, and maintenance.
- Provided system security with task-based permissions and privileges for users and roles.
- Implemented the Alarms, Messages, and Error Report web services.
- Experience in developing applications using Eclipse IDE and running test cases
- Analyzed application bugs and created and proposed remediation measures to fix them.
- Developed action classes and configuration files for the Struts framework.
- Developed Oracle stored procedures and complex SQL.
- Created and maintained PL/SQL procedures that run in the background, freeing application threads from database deadlocks
- Tested the application thoroughly during the unit-testing phase using JUnit
Environment: Java, XML-RPC, Web services, Struts, JSP, Oracle, JUnit, JDBC, UML, Eclipse, PL/SQL, Core Java
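A minimal sketch of the XML-RPC client/server interaction described in this role, assuming the Apache XML-RPC 3.x client library. The endpoint URL and remote method name are hypothetical; in the actual system each PLC would expose its own endpoint so that every device can be treated as a service.

```java
import java.net.URL;
import org.apache.xmlrpc.client.XmlRpcClient;
import org.apache.xmlrpc.client.XmlRpcClientConfigImpl;

public class PlcStatusClient {
    public static void main(String[] args) throws Exception {
        // Hypothetical PLC endpoint exposed over XML-RPC.
        XmlRpcClientConfigImpl config = new XmlRpcClientConfigImpl();
        config.setServerURL(new URL("http://plc-01.local:8080/xmlrpc"));

        XmlRpcClient client = new XmlRpcClient();
        client.setConfig(config);

        // Hypothetical remote method: reads the current status word of the PLC.
        Object status = client.execute("device.getStatus", new Object[] {});
        System.out.println("PLC status: " + status);
    }
}
```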