Hadoop Developer Resume
Charlotte, NC
SUMMARY:
- Professional experience working with Java and Big Data technologies: the Hadoop ecosystem (HDFS, MapReduce framework), NoSQL databases (HBase), Hive, and Sqoop.
- Experience with the Hadoop stack, cluster architecture, and cluster monitoring.
- Experience in NoSQL databases like HBase, Cassandra, and MongoDB.
- Involved in all the phases of Software Development Life Cycle (SDLC): Requirements gathering, analysis, design, development, testing, production and post-production support.
- Well versed in developing and implementing MapReduce programs for analyzing Big Data in a variety of file formats, covering both structured and unstructured data.
- Experience importing and exporting data between HDFS and relational database systems (RDBMS) using Sqoop.
- Practical knowledge of cleansing and analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
- Experienced in writing custom UDFs and UDAFs to extend Hive and Pig functionality (a minimal Hive UDF sketch follows this summary).
- Able to develop Pig UDFs to pre-process data for analysis.
- Skilled in creating workflows using Oozie.
- Experienced with the Java API and REST interface for accessing HBase data.
- Hands on experience with supervised and unsupervised machine learning methods.
- Experience working with batch-processing and real-time systems using open source technologies such as Hadoop, NoSQL databases, and Storm.
- Collected data from sources such as web servers and social media, stored it in HDFS, and analyzed it using other Hadoop technologies.
- Worked extensively with dimensional modeling, data migration, data cleansing, data profiling, and ETL processes for data warehouses.
- Good experience working with the Hortonworks and Cloudera distributions.
- Good experience with Core Java, implementing OOP concepts, multithreading, collections, and exception handling.
- Involved in developing Social Media Analytics tools.
- Extensive experience in analyzing social media data and optimizing social media content.
- Experience in developing and deploying applications using WebSphere Application Server, Tomcat, and WebLogic.
- Experienced in analyzing business requirements and translating requirements into functional and technical design specifications using UML.
- Good knowledge of machine learning concepts using the Mahout and Mallet packages.
- Capable of processing large sets of structured, semi-structured, and unstructured data; researching and applying machine learning methods; and implementing and leveraging open source technologies.
- Hands-on experience in creating Ontologies.
- Experience in creating and using RDF, RDFS and OWL languages.
- Contributed enhanced functionality to open source products such as Maven and Mahout.
- Implemented complex projects dealing with considerable data volumes (GB to PB scale).
- Experienced with editors/IDEs such as Eclipse and NetBeans.
- Strong problem-solving, analytical, and interpersonal skills; a valuable team player.
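Illustrative only: a minimal sketch of the kind of custom Hive UDF noted above. The class name, the normalization logic, and the registration statement are hypothetical; the base class and evaluate() convention are the standard Hive UDF API.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical Hive UDF that normalizes free-text columns before analysis:
// trims whitespace and lower-cases the value. It would be registered in Hive with
//   CREATE TEMPORARY FUNCTION normalize_text AS 'com.example.NormalizeTextUDF';
public class NormalizeTextUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;                       // Hive convention: NULL in, NULL out
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```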
TECHNICAL SKILLS:
Languages: Java, C/C++, Assembly Language
Big Data Ecosystems: Hadoop, MapReduce, HDFS, HBase, ZooKeeper, Hive, Pig, Sqoop, Apache Storm, Kafka, Flume
Web Technologies: JSP & Servlets, XML, HTML, JSON, JavaScript, jQuery, Web services
Databases: NoSQL, Oracle, Neo4j, Gruff
IDEs: Eclipse, NetBeans
Application Servers: Apache Tomcat 5.x/6.0, JBoss 4.0
Toolkits and Packages: Mahout, Mallet, and Stanford NLP
Data Visualization Tools: Gephi and Neo4j
Networking Protocols: SOAP, HTTP, and TCP/IP
Operating Systems: Windows, Windows Server, Linux
PROFESSIONAL EXPERIENCE:
Confidential, Charlotte, NC
Hadoop Developer
Responsibilities:
- Involved in the full project life cycle: design, analysis, logical and physical architecture modeling, development, implementation, and testing.
- Responsible for managing data coming from different sources; involved in HDFS maintenance and in loading structured and unstructured data.
- Developed MapReduce programs to parse the raw data and store the refined data in tables (see the mapper sketch after this project).
- Designed and modified database tables and used HBase queries to insert and fetch data from tables.
- Moved log files generated from various sources to HDFS through Flume for further processing.
- Developed algorithms for identifying influencers within specified social network channels.
- Developed and updated social media analytics dashboards on a regular basis.
- Involved in loading and transforming large sets of structured, semi-structured, and unstructured data from relational databases into HDFS using Sqoop imports.
- Analyzed data with Hive, Pig, and Hadoop Streaming.
- Responsible for analyzing and cleansing raw data by performing Hive queries and running Pig scripts on data.
- Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
- Created Hive tables, loaded data, and wrote Hive queries that run internally as MapReduce jobs.
- Used Oozie operational services for batch processing and for scheduling workflows dynamically.
- Populated HDFS and Cassandra with huge amounts of data using Apache Kafka.
- Experienced in working with Apache Storm.
- Developed Pig Latin Scripts to pre-process the data.
- Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
- Fetched brand data from social media applications such as Facebook and Twitter.
- Performed data mining investigations to find new insights related to customers.
- Involved in forecasting based on current results and insights derived from data analysis.
- Developed a domain-specific sentiment analysis system using supervised machine learning.
- Collected data and identified data patterns to build trained models using machine learning.
- Created a complete processing engine based on Cloudera's distribution, tuned for performance.
- Managed and reviewed Hadoop log files.
- Generated insights from brand conversations, which in turn helped drive brand awareness, engagement, and traffic to social media pages.
- Identified topics and trends and built context around the brand.
- Developed different formulas for calculating engagement on social media posts.
- Involved in identifying and analyzing defects, questionable function errors, and inconsistencies in output.
- Reviewed technical documentation and provided feedback.
- Involved in fixing issues arising out of duration testing.
Environment: Java, NLP, HBase, Machine Learning, Hadoop, HDFS, MapReduce, Hive, Apache Storm, Sqoop, Flume, Oozie, Apache Kafka, ZooKeeper, MySQL, and Eclipse
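Illustrative only: a minimal sketch of the kind of MapReduce parsing program mentioned in this project. The delimiter, field positions, and class name are assumptions; a standard sum reducer would aggregate the emitted counts.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper that parses raw, tab-delimited web-server log lines
// and emits (pageUrl, 1) pairs; a standard sum reducer aggregates the hits.
public class LogParseMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text page = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\t");
        if (fields.length < 3) {
            return;                     // skip malformed records instead of failing the job
        }
        page.set(fields[2]);            // assume the third column holds the requested URL
        context.write(page, ONE);
    }
}
```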
Confidential, Houston, TX
Hadoop Developer
Responsibilities:
- Responsible for managing data coming from different sources; involved in HDFS maintenance and in loading structured and unstructured data.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Gathered and analyzed data through MapReduce jobs to understand the customer base, buying habits, buying decisions, promotional effectiveness, and inventory management.
- Used Oozie Operational Services for batch processing and scheduling workflows dynamically.
- Worked extensively on creating end-to-end data pipeline orchestration using Oozie.
- Responsible for loading customer data and event logs into HBase using the Java API (see the HBase loading sketch after this project).
- Created HBase tables to store input data arriving in variable formats from different portfolios.
- Added huge volumes of data as rows and columns to HBase.
- Wrote MapReduce jobs to discover trends in data usage by users.
- Used Sqoop to import data from RDBMS into HDFS on a regular basis.
- Processed the source data into structured form and stored it in the NoSQL database HBase.
- Designed and developed a Java API (Commerce API) that provides functionality to connect to HBase through Java services.
- Created Hive tables to store the processed results in a tabular format.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Supported MapReduce programs running on the cluster.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Managed and reviewed Hadoop log files.
- Supported and troubleshot MapReduce programs running on the cluster.
- Used JUnit for unit testing.
- Involved in fixing issues arising out of duration testing.
Environment: Java, Machine Learning, Mallet, Mahout, HBase, Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop.
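Illustrative only: a minimal sketch of loading a record into HBase through the Java client API, as referenced in this project. The table name, column family, qualifiers, and row key format are hypothetical; the calls are the standard HBase 1.x client API.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical loader that writes one customer event into an HBase table.
public class CustomerEventLoader {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();   // reads hbase-site.xml from the classpath
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("customer_events"))) {
            Put put = new Put(Bytes.toBytes("cust123#2015-06-01"));   // assumed composite row key
            put.addColumn(Bytes.toBytes("e"), Bytes.toBytes("type"), Bytes.toBytes("purchase"));
            put.addColumn(Bytes.toBytes("e"), Bytes.toBytes("amount"), Bytes.toBytes("42.50"));
            table.put(put);
        }
    }
}
```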
Confidential, El Segundo, CA
Senior Java Developer
Responsibilities:
- Designed and deployed sentiment analysis classifiers to predict the sentiment of unseen data using maximum entropy and Naïve Bayes methods, serving online and offline use cases at scale (see the Naïve Bayes sketch after this project).
- Established a process for using human-developed training sets for classifiers, significantly scaling up innovation and deployment of targeted classifiers for specific business use cases.
- Involved in identifying the patterns for training the data.
- Involved in evaluating the trained model on test data.
- Achieved 90% accuracy in sentiment analysis in the retail domain.
- Developed trained models for the retail, insurance, and power domains.
- Involved in fixing issues found during duration testing.
- Responsible for understanding the scope of the project; developed an algorithm for identifying influencers per brand across various social networking sites.
- Fetched data for different users from social media applications such as Facebook and Twitter.
- Involved in upgrading the algorithm.
- Explored different open source packages that help with generating sentiment analysis.
- Maintained project documentation for the module.
- Involved in identifying and analyzing defects, questionable function errors, and inconsistencies in output.
- Reviewed technical documentation and provided feedback.
- Involved in fixing issues arising out of duration testing.
Environment: Java, Machine Learning, Mallet, Mahout, HBase, Hadoop, HDFS, MapReduce.
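Illustrative only: a library-agnostic sketch of the multinomial Naïve Bayes scoring used in the sentiment classifiers described in this project (the actual work used Mallet and Mahout, so nothing here reflects those APIs). The labels and training sentences are hypothetical.

```java
import java.util.*;

// Sketch of multinomial Naive Bayes with Laplace smoothing for sentiment labels.
public class NaiveBayesSketch {
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>(); // label -> word -> count
    private final Map<String, Integer> docCounts = new HashMap<>();               // label -> #docs
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    public void train(String label, String text) {
        totalDocs++;
        docCounts.merge(label, 1, Integer::sum);
        Map<String, Integer> counts = wordCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String w : text.toLowerCase().split("\\s+")) {
            vocabulary.add(w);
            counts.merge(w, 1, Integer::sum);
        }
    }

    public String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : docCounts.keySet()) {
            Map<String, Integer> counts = wordCounts.get(label);
            int totalWords = counts.values().stream().mapToInt(Integer::intValue).sum();
            double score = Math.log(docCounts.get(label) / (double) totalDocs);   // class prior
            for (String w : text.toLowerCase().split("\\s+")) {
                int c = counts.getOrDefault(w, 0);
                score += Math.log((c + 1.0) / (totalWords + vocabulary.size()));  // Laplace smoothing
            }
            if (score > bestScore) { bestScore = score; best = label; }
        }
        return best;
    }

    public static void main(String[] args) {
        NaiveBayesSketch nb = new NaiveBayesSketch();
        nb.train("positive", "great product loved it");
        nb.train("negative", "terrible quality would not buy again");
        System.out.println(nb.classify("loved the quality"));   // expected: positive
    }
}
```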
Confidential
Java Developer
Responsibilities:
- Developed algorithms to calculate a user's influence.
- Developed algorithms to generate the niche users of a particular user.
- Developed an algorithm for providing recommendations to a particular user across different social networking applications.
- Fetched user data from social media applications such as Facebook and Twitter.
- Designed and modified database tables and used HBase queries to insert and fetch data from tables.
- Created triples using different OWL properties (see the RDF/OWL sketch after this project).
- Used different OWL inference properties.
- Generated data to build an interest graph for the user.
- Created ontologies for different social applications.
- Hands-on experience working with the AllegroGraph server.
- Experience working with OWL Lite, OWL DL, and OWL Full properties.
- Worked on graph visualization tools such as Gephi, Gruff, and Neo4j.
- Built context around the user, which in turn was used for giving recommendations.
- Maintained project documentation for the module.
- Involved in identifying and analyzing defects, questionable function errors, and inconsistencies in output.
- Reviewed technical documentation and provided feedback.
- Involved in fixing issues arising out of duration testing.
Environment: Java, GATE, AllegroGraph Server, Gephi, Gruff, Hadoop, HBase, Web Ontology Language (OWL), Machine Learning, SPARQL, Hibernate, RDF, MySQL, Struts2, and Tiles framework.
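Illustrative only: a generic sketch of creating interest-graph triples with an OWL-typed property, as described in this project. The project itself used AllegroGraph and Gruff; Apache Jena is used here purely as a self-contained way to show the same idea, and all URIs are hypothetical.

```java
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.vocabulary.OWL;
import org.apache.jena.vocabulary.RDF;

// Builds a single interest-graph triple with an OWL-typed property and prints it as Turtle.
public class InterestGraphSketch {
    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();

        Property interestedIn = model.createProperty("http://example.org/onto#interestedIn");
        interestedIn.addProperty(RDF.type, OWL.ObjectProperty);       // declare as an OWL object property

        Resource user = model.createResource("http://example.org/user/alice");
        Resource topic = model.createResource("http://example.org/topic/photography");
        user.addProperty(interestedIn, topic);                        // the triple: alice interestedIn photography

        model.write(System.out, "TURTLE");                            // serialize for inspection
    }
}
```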