
Hadoop Developer Resume


Charlotte, NC

SUMMARY:

  • Professional experience working with Java and Big Data technologies: the Hadoop ecosystem (HDFS, MapReduce framework), NoSQL databases (HBase), Hive, and Sqoop.
  • Experience with the Hadoop stack, cluster architecture, and cluster monitoring.
  • Experience in NoSQL databases like HBase, Cassandra, and MongoDB.
  • Involved in all the phases of Software Development Life Cycle (SDLC): Requirements gathering, analysis, design, development, testing, production and post-production support.
  • Well versed in developing and implementing MapReduce programs for analyzing Big Data in different formats, both structured and unstructured.
  • Experience in importing and exporting data between HDFS and relational database systems (RDBMS) using Sqoop.
  • Hands-on knowledge of cleansing and analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
  • Experienced in writing custom UDFs and UDAFs to extend Hive and Pig functionality (see the Hive UDF sketch after this list).
  • Ability to develop Pig UDFs to pre-process data for analysis.
  • Skilled in creating workflows using Oozie.
  • Experienced in using the Java API and REST to access HBase data.
  • Hands on experience with supervised and unsupervised machine learning methods.
  • Experience working with batch-processing and real-time systems using open source technologies such as Hadoop, NoSQL databases, and Storm.
  • Collected data from different sources such as web servers and social media, stored it in HDFS, and analyzed it using other Hadoop technologies.
  • Worked extensively with dimensional modeling, data migration, data cleansing, data profiling, and ETL processes for data warehouses.
  • Good experience working with the Hortonworks and Cloudera distributions.
  • Good experience with Core Java, implementing OOP concepts, multithreading, collections, and exception handling.
  • Involved in developing Social Media Analytics tools.
  • Extensive experience in analyzing social media data and optimization of social media content.
  • Experience in developing and deploying applications using WebSphere Application Server, Tomcat, and WebLogic.
  • Experienced in analyzing business requirements and translating requirements into functional and technical design specifications using UML.
  • Good knowledge of machine learning concepts using the Mahout and Mallet packages.
  • Capable of processing large sets of structured, semi-structured, and unstructured data; researching and applying machine learning methods; and implementing and leveraging open source technologies.
  • Hands-on experience in creating ontologies.
  • Experience in creating and using the RDF, RDFS, and OWL languages.
  • Contributed enhanced functionality to open source products such as Maven and Mahout.
  • Implemented complex projects dealing with considerable data sizes (GB to PB scale) and high complexity.
  • Experienced with IDEs such as Eclipse and NetBeans.
  • Strong problem-solving, analytical, and interpersonal skills; a valuable team player.
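
For illustration, a minimal sketch of a Hive UDF of the kind mentioned above, using Hive's classic org.apache.hadoop.hive.ql.exec.UDF API for one-row-in/one-row-out functions. The class name and normalization logic are hypothetical, not taken from any project described here.

    // Hypothetical Hive UDF: normalizes a free-text column.
    // Hive resolves evaluate() by reflection; returning null propagates SQL NULLs.
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    public class NormalizeText extends UDF {
        private final Text result = new Text(); // reused across rows to avoid allocation

        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            result.set(input.toString().trim().toLowerCase());
            return result;
        }
    }

Such a function would be registered in Hive with ADD JAR followed by CREATE TEMPORARY FUNCTION normalize_text AS 'NormalizeText'.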

TECHNICAL SKILLS:

Languages: Java, C/C++, Assembly Language

Big Data Ecosystems: Hadoop, MapReduce, HDFS, HBase, Zookeeper, Hive, Pig, Sqoop, Apache Storm, Kafka, Flume

Web Technologies: JSP & Servlets, XML, HTML, JSON, JavaScript, jQuery, Web services

Databases: NoSQL, Oracle, Neo4j, Gruff

IDEs: Eclipse, NetBeans

Application Servers: Apache Tomcat 5.x/6.0, JBoss 4.0

Toolkits and Packages: Mahout, Mallet, and Stanford NLP

Data Visualization Tools: Gephi and Neo4j

Networking Protocols: SOAP, HTTP, and TCP/IP

Operating Systems: Windows, Windows Server, Linux

PROFESSIONAL EXPERIENCE:

Confidential, Charlotte, NC 

Hadoop Developer

Responsibilities:

  • Involved in the full project life cycle: design, analysis, logical and physical architecture modeling, development, implementation, and testing.
  • Responsible for managing data coming from different sources; involved in HDFS maintenance and loading of structured and unstructured data.
  • Developed MapReduce programs to parse raw data and store the refined data in tables (see the MapReduce sketch after this list).
  • Designed and modified database tables and used HBase queries to insert and fetch data from tables.
  • Involved in moving all log files generated from various sources into HDFS through Flume for further processing.
  • Developed algorithms for identifying influencers within specified social network channels.
  • Developed and updated social media analytics dashboards on a regular basis.
  • Involved in loading and transforming large sets of structured, semi-structured, and unstructured data from relational databases into HDFS using Sqoop imports.
  • Analyzed data with Hive, Pig, and Hadoop Streaming.
  • Responsible for analyzing and cleansing raw data by performing Hive queries and running Pig scripts on the data.
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Created Hive tables, loaded data, and wrote Hive queries that run internally as MapReduce jobs.
  • Used Oozie operational services for batch processing and scheduling workflows dynamically.
  • Populated HDFS and Cassandra with huge amounts of data using Apache Kafka (see the Kafka sketch after this list).
  • Experienced in working with Apache Storm.
  • Developed Pig Latin scripts to pre-process the data.
  • Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Involved in fetching brand data from social media applications such as Facebook and Twitter.
  • Performed data mining investigations to find new insights related to customers.
  • Involved in forecasting based on current results and insights derived from data analysis.
  • Developed a domain-specific sentiment analysis system using supervised machine learning methods.
  • Involved in collecting data and identifying data patterns to build a trained model using machine learning.
  • Created a complete processing engine based on Cloudera's distribution, tuned for performance.
  • Managed and reviewed Hadoop log files.
  • Developed and generated insights based on brand conversations, which in turn helped effectively drive brand awareness, engagement, and traffic to social media pages.
  • Involved in identifying topics and trends and building context around the brand.
  • Developed different formulas for calculating engagement on social media posts.
  • Involved in identifying and analyzing defects, questionable function errors, and inconsistencies in output.
  • Reviewed technical documentation and provided feedback.
  • Involved in fixing issues arising out of duration testing.
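
A minimal sketch of a raw-data parsing job of the kind described in this list. The tab-delimited layout, field positions, and class names are assumptions; the Mapper, Reducer, and Job calls are the standard org.apache.hadoop.mapreduce API.

    // Hypothetical MapReduce job: parses raw tab-delimited log lines and
    // counts events per user. Malformed lines are skipped rather than failing the job.
    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class LogParseJob {

        // Emits (userId, 1) for every well-formed log line.
        public static class ParseMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
            private static final LongWritable ONE = new LongWritable(1);
            private final Text userId = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\t");
                if (fields.length < 3) {
                    return; // drop malformed records
                }
                userId.set(fields[1]); // assumed: second column holds the user id
                context.write(userId, ONE);
            }
        }

        // Sums the counts per user.
        public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
            @Override
            protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                    throws IOException, InterruptedException {
                long sum = 0;
                for (LongWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new LongWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "log-parse");
            job.setJarByClass(LogParseJob.class);
            job.setMapperClass(ParseMapper.class);
            job.setCombinerClass(SumReducer.class); // safe: sum is associative
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(LongWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }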
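
A minimal sketch of publishing raw events to Kafka before they land in HDFS and Cassandra, as described above. The broker address, topic name, and payload are assumptions; KafkaProducer and ProducerRecord are the standard Kafka client API.

    // Hypothetical Kafka producer: one record per raw event, keyed by source
    // so each source's events stay in order within a partition.
    import java.util.Properties;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class EventPublisher {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092"); // assumed broker address
            props.put("key.serializer",
                      "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                      "org.apache.kafka.common.serialization.StringSerializer");

            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("raw-events", "web-server-01",
                        "{\"ts\": 1428000000, \"path\": \"/home\"}"));
            } // close() flushes any buffered records
        }
    }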

Environment: Java, NLP, HBase, Machine Learning, Hadoop, HDFS, MapReduce, Hive, Apache Storm, Sqoop, Flume, Oozie, Apache Kafka, Zookeeper, MySQL, and Eclipse

Confidential, Houston, TX 

Hadoop Developer

Responsibilities:

  • Responsible for managing data coming from different sources; involved in HDFS maintenance and loading of structured and unstructured data.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Gathered and analyzed data through MapReduce jobs to understand the customer base, buying habits, promotional effectiveness, inventory management, and buying decisions.
  • Used Oozie operational services for batch processing and scheduling workflows dynamically.
  • Extensively worked on creating end-to-end data pipeline orchestration using Oozie (see the Oozie sketch after this list).
  • Responsible for loading customer data and event logs into HBase using the Java API (see the HBase sketch after this list).
  • Created HBase tables to store variable data formats of input data coming from different portfolios.
  • Involved in adding huge volumes of data in rows and columns to store data in HBase.
  • Wrote MapReduce jobs to discover trends in data usage by users.
  • Used Sqoop to import data from RDBMS into HDFS on a regular basis.
  • Processed source data into structured data and stored it in the NoSQL database HBase.
  • Designed and developed a Java API (Commerce API) that provides functionality to connect to HBase through Java services.
  • Created Hive tables to store the processed results in a tabular format.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Supported MapReduce programs running on the cluster.
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Managed and reviewed Hadoop log files.
  • Troubleshot MapReduce program failures on the cluster.
  • Used JUnit for unit testing.
  • Involved in fixing issues arising out of duration testing.
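
A minimal sketch of submitting an Oozie-orchestrated pipeline from Java. The Oozie URL, HDFS paths, and property values are assumptions; OozieClient is Oozie's standard Java client API.

    // Hypothetical launcher: submits and starts a workflow whose actions are
    // defined in a workflow.xml stored under an HDFS application path.
    import java.util.Properties;

    import org.apache.oozie.client.OozieClient;

    public class PipelineLauncher {
        public static void main(String[] args) throws Exception {
            OozieClient client = new OozieClient("http://oozie-host:11000/oozie"); // assumed URL

            Properties conf = client.createConfiguration();
            conf.setProperty(OozieClient.APP_PATH, "hdfs://namenode/user/etl/pipeline"); // assumed path
            conf.setProperty("nameNode", "hdfs://namenode:8020");
            conf.setProperty("jobTracker", "jobtracker:8032");

            String jobId = client.run(conf); // submits and starts the workflow
            System.out.println("Started workflow " + jobId);
        }
    }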
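
A minimal sketch of loading a customer event row into HBase through the Java API. The table name, column family, and row-key layout are assumptions; ConnectionFactory, Table, and Put are the standard HBase client classes.

    // Hypothetical HBase loader: writes one event row into a "customer_events" table.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class CustomerEventLoader {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("customer_events"))) {
                // Assumed row key: customerId plus reversed timestamp, so recent events sort first.
                Put put = new Put(Bytes.toBytes(
                        "cust0042-" + (Long.MAX_VALUE - System.currentTimeMillis())));
                put.addColumn(Bytes.toBytes("e"), Bytes.toBytes("type"), Bytes.toBytes("page_view"));
                put.addColumn(Bytes.toBytes("e"), Bytes.toBytes("source"), Bytes.toBytes("portfolio_a"));
                table.put(put);
            }
        }
    }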

Environment: Java, Machine Learning, Mallet, Mahout, HBase, Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop.

Confidential, El Segundo, CA

Senior Java Developer

Responsibilities:

  • Designed and deployed sentiment analysis classifiers to predict the sentiment of unseen data using Maximum Entropy and Naïve Bayes methods, serving online and offline use cases at scale (see the classifier sketch after this list).
  • Established process to use training sets developed by humans for classifiers, significantly scaling up innovation and deployment of targeted classifiers for specific business use cases.
  • Involved in identifying the patterns for training the data.
  • Involved in evaluating the trained model on test data.
  • Achieved 90% accuracy in sentiment analysis in the retail domain.
  • Developed trained models for the retail, insurance, and power domains.
  • Involved in fixing issues found during duration testing.
  • Responsible for understanding the scope of the project; developed an algorithm for identifying influencers per brand across various social networking sites.
  • Involved in fetching data of different users from social media applications such as Facebook and Twitter.
  • Involved in upgrading the algorithm.
  • Involved in exploring different open source packages that help in generating sentiment analysis.
  • Maintained project documentation for the module.
  • Involved in identifying and analyzing defects, questionable function errors, and inconsistencies in output.
  • Reviewed technical documentation and provided feedback.
  • Involved in fixing issues arising out of duration testing.
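
A minimal sketch of training a Naïve Bayes sentiment classifier with Mallet, in the spirit of the work above. The labels and sample texts are illustrative stand-ins for the human-built training sets; the pipe and trainer classes are Mallet's standard text-classification API.

    // Hypothetical trainer: tokenizes text, builds feature vectors, trains
    // Naïve Bayes, then classifies an unseen string through the same pipe.
    import java.util.ArrayList;

    import cc.mallet.classify.Classifier;
    import cc.mallet.classify.NaiveBayesTrainer;
    import cc.mallet.pipe.CharSequence2TokenSequence;
    import cc.mallet.pipe.FeatureSequence2FeatureVector;
    import cc.mallet.pipe.Pipe;
    import cc.mallet.pipe.SerialPipes;
    import cc.mallet.pipe.Target2Label;
    import cc.mallet.pipe.TokenSequence2FeatureSequence;
    import cc.mallet.types.Instance;
    import cc.mallet.types.InstanceList;

    public class SentimentTrainer {
        public static void main(String[] args) {
            ArrayList<Pipe> pipes = new ArrayList<>();
            pipes.add(new Target2Label());                  // map label strings to targets
            pipes.add(new CharSequence2TokenSequence());    // tokenize the raw text
            pipes.add(new TokenSequence2FeatureSequence()); // tokens -> feature ids
            pipes.add(new FeatureSequence2FeatureVector()); // ids -> bag-of-words vector
            InstanceList instances = new InstanceList(new SerialPipes(pipes));

            // Illustrative human-labeled examples; real training sets were larger.
            instances.addThruPipe(new Instance("great product, fast delivery", "positive", "ex1", null));
            instances.addThruPipe(new Instance("terrible service, never again", "negative", "ex2", null));

            Classifier classifier = new NaiveBayesTrainer().train(instances);
            // classify(Object) pipes raw text through the classifier's own instance pipe.
            System.out.println(classifier.classify("very happy with it")
                    .getLabeling().getBestLabel());
        }
    }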

Environment: Java, Machine Learning, Mallet, Mahout, HBase, Hadoop, HDFS, MapReduce.

Confidential

Java Developer

Responsibilities:

  • Developed algorithms for calculating a user's influence.
  • Developed algorithms to identify the niche users of a particular user.
  • Developed an algorithm for providing recommendations to a particular user across different social networking applications.
  • Involved in fetching a user's data from social media applications such as Facebook and Twitter.
  • Designed and modified database tables and used HBase queries to insert and fetch data from tables.
  • Involved in creating triples using different OWL properties (see the RDF sketch after this list).
  • Used different OWL inference properties.
  • Involved in generating data to find the interest graph for the user.
  • Involved in creating ontologies for different social applications.
  • Hands-on experience in working with the AllegroGraph server.
  • Experience in working with OWL Lite, OWL DL, and OWL Full properties.
  • Worked on graph visualization tools such as Gephi, Gruff, and Neo4j.
  • Involved in building context around the user, which in turn was used for giving recommendations.
  • Maintained project documentation for the module.
  • Involved in identifying and analyzing defects, questionable function errors, and inconsistencies in output.
  • Reviewed technical documentation and provided feedback.
  • Involved in fixing issues arising out of duration testing.
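
A minimal sketch of building user-interest triples with RDF and OWL properties. The project used the AllegroGraph server; Apache Jena is shown here only as a self-contained stand-in, and the namespace and property names are assumptions.

    // Hypothetical triple builder: types a user, attaches an interest, and
    // links the same person's accounts across applications with owl:sameAs.
    import org.apache.jena.rdf.model.Model;
    import org.apache.jena.rdf.model.ModelFactory;
    import org.apache.jena.rdf.model.Property;
    import org.apache.jena.rdf.model.Resource;
    import org.apache.jena.vocabulary.OWL;
    import org.apache.jena.vocabulary.RDF;

    public class InterestGraphBuilder {
        public static void main(String[] args) {
            String ns = "http://example.org/social#"; // assumed namespace
            Model model = ModelFactory.createDefaultModel();

            Property hasInterest = model.createProperty(ns, "hasInterest");
            Resource user = model.createResource(ns + "user42")
                    .addProperty(RDF.type, model.createResource(ns + "User"))
                    .addProperty(hasInterest, model.createResource(ns + "Photography"));

            // owl:sameAs asserts both resources denote the same individual.
            user.addProperty(OWL.sameAs, model.createResource(ns + "twitter_user42"));

            model.write(System.out, "TURTLE"); // serialize the triples
        }
    }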

Environment: Java, GATE, AllegroGraph Server, Gephi, Gruff, Hadoop, HBase, Web Ontology Language (OWL), Machine Learning, SPARQL, Hibernate, RDF, MySQL, Struts2, and Tiles framework.
