Senior Hadoop Developer Resume
Florida
SUMMARY:
- 6+ years of professional IT experience in Analysis, Design, Development, Testing, Documentation, Deployment, Integration, and Maintenance of web-based and Client/Server applications using Java and Big Data technologies.
- 5 years of experience in Application Development and Data Management using Hadoop and related Big Data technologies such as HBase, Hive, Pig, Flume, Oozie, Sqoop, Kafka, Spark, and ZooKeeper.
- In-depth knowledge of Data Structures and the Design and Analysis of Algorithms, with a good understanding of Data Mining and Machine Learning techniques.
- Excellent knowledge of Hadoop architecture and various components such as HDFS, Job Tracker, Task Tracker, Name Node, and Data Node.
- Well versed in the installation, configuration, support, and management of Big Data and the underlying infrastructure of a Hadoop cluster.
- Hadoop/Big Data technology experience in the storage, querying, processing, and analysis of data.
- Proficient in the design and development of MapReduce programs using Apache Hadoop for analyzing big data as per requirements.
- Hands-on experience in installing, configuring, and using Hadoop ecosystem components like HDFS, MapReduce, HBase, ZooKeeper, Oozie, Hive, Sqoop, Pig, and Flume.
- Experience with distributed systems, large-scale non-relational data stores, MapReduce systems, data modeling, and big data systems
- Good knowledge of job scheduling and workflow design tools like Oozie.
- Installed and configured MapReduce, Hive, and HDFS; implemented a CDH3 Hadoop cluster on CentOS and assisted with performance tuning and monitoring.
- Good understanding of NoSQL databases such as HBase, Cassandra and MongoDB.
- Extensive knowledge of data serialization formats like Avro and SequenceFile.
- Experience in monitoring and controlling large-scale cloud (AWS) infrastructure
- Extended Hive and Pig core functionality by writing custom UDFs.
- Developed Python scripts to format and create daily transmission files (an illustrative sketch follows this summary).
- Worked extensively with Dimensional Modeling, Data Migration, Data Cleansing, Data Profiling, and ETL processes for data warehouses.
- Sound understanding of relational database concepts; worked extensively with Oracle, MySQL, DB2, and SQL Server.
- Good understanding of ECMP and multi-path networking; solid understanding of TCP/IP networking and socket programming.
- Experience with communication protocols like TCP/IP and HTTP.
- Good experience with databases, writing complex queries and stored procedures using SQL and PL/SQL.
- Expertise in Web Services architecture with SOAP and WSDL using JAX-RPC.
- Expertise in using configuration management tools like Subversion (SVN), Rational ClearCase, CVS, and Git for version control.
- Hands-on experience in developing web applications using the Spring Framework web module and integrating with the Struts MVC framework.
- Excellent experience with data warehousing concepts such as Star Schema, Snowflake Schema, and fact and dimension tables.
- Good understanding of Slowly Changing Dimensions (SCDs) such as SCD1, SCD2, and SCD3.
- Experience in full-stack development (Java, Scala, Python, etc.).
- Developed a scalable, cost-effective, and fault-tolerant data warehouse system on the Amazon EC2 cloud.
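A minimal sketch of the kind of daily transmission-file script mentioned above. The input CSV, field names, and fixed-width layout are hypothetical illustrations, not taken from the actual project.

```python
import csv
from datetime import date

# Hypothetical layout: account id right-padded to 10 characters,
# amount zero-padded to 12 digits, date kept as YYYYMMDD.
FIELD_WIDTHS = {"account_id": 10, "amount": 12}

def format_record(row):
    """Format one CSV row as a fixed-width transmission record."""
    account = row["account_id"].rjust(FIELD_WIDTHS["account_id"])
    amount = row["amount"].replace(".", "").zfill(FIELD_WIDTHS["amount"])
    return f"{account}{amount}{row['txn_date']}"

def build_transmission_file(csv_path, out_path):
    """Read the day's CSV extract and write the formatted transmission file."""
    with open(csv_path, newline="") as src, open(out_path, "w") as dst:
        for row in csv.DictReader(src):
            dst.write(format_record(row) + "\n")

if __name__ == "__main__":
    today = date.today().strftime("%Y%m%d")
    build_transmission_file("daily_extract.csv", f"transmission_{today}.dat")
```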
WORK EXPERIENCE:
Senior Hadoop Developer
Confidential - Florida
Responsibilities:
- Involved in the complete SDLC of the project, including requirements gathering, design documents, development, testing, and production environments.
- Responsible for managing data coming from different sources; involved in HDFS maintenance and loading of structured and unstructured data.
- Gathered and analyzed requirements and coordinated onsite and offshore team members.
- Developed Java MapReduce programs to transform mainframe data into a structured format.
- Worked on MongoDB using CRUD (Create, Read, Update, and Delete), Indexing, Replication, and Sharding features.
- Performed data analysis in Hive by creating tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Streamed real-time data using Spark with Kafka (see the sketch after this responsibilities list).
- Created Hive external tables, loaded data into them, and queried the data using HQL.
- Developed optimal strategies for distributing the mainframe data over the cluster; imported and exported the stored mainframe data into HDFS and Hive.
- Implemented Hive Generic UDFs to incorporate business logic into Hive queries.
- Implemented Spark jobs using Python (PySpark) and Spark SQL for faster testing and processing of data.
- Used the HBase API to store data from Hive tables into HBase tables.
- Wrote Hive queries joining multiple tables based on business requirements.
- Monitored workload and job performance and performed capacity planning using Cloudera Manager.
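A minimal sketch of the PySpark-with-Kafka streaming described above, assuming Spark Structured Streaming with the spark-sql-kafka connector on the classpath. The broker address, topic name, record schema, and HDFS paths are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("kafka-stream-sketch").getOrCreate()

# Hypothetical schema for the incoming JSON messages.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("amount", DoubleType()),
])

# Read the Kafka topic as a streaming DataFrame and parse the JSON payload.
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "transactions")
          .load()
          .select(from_json(col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

# Land the parsed events on HDFS as Parquet, where Hive external tables can pick them up.
query = (events.writeStream
         .format("parquet")
         .option("path", "hdfs:///data/transactions")
         .option("checkpointLocation", "hdfs:///checkpoints/transactions")
         .start())
query.awaitTermination()
```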
Big Data Developer
Confidential
Responsibilities:
- Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using the CDH3 distribution.
- Involved in creating Hive tables, loading the data, and writing Hive queries that run internally as MapReduce jobs.
- Involved in writing MapReduce jobs
- Streamed real-time data using Spark with Kafka.
- Responsible for developing a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
- Installed and configured Hive and wrote Hive UDFs.
- Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Used Pig for transformations, event joins, filtering of bot traffic, and pre-aggregations before storing the data in HDFS.
- Extracted and loaded data into MongoDB using the MongoDB import and export command-line utilities.
- Supported MapReduce programs running on the cluster.
- Involved in developing Pig scripts for change data capture and delta record processing between newly arrived data and existing data in HDFS.
- Designed and Developed Dashboards using Tableau
- Involved in pivoting the HDFS data from rows to columns and columns to rows (see the sketch after this job entry).
Environment: Hadoop, MapReduce, MongoDB, Yarn, Hive, Pig, HBase, Spark, Kafka, Tableau, Oozie, Sqoop, Flume, Oracle 11g, Core Java, Cloudera, HDFS, Eclipse
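One way to do the row/column pivoting mentioned above in PySpark (Spark appears in this environment); the input path and the customer/metric/value column names are hypothetical, and the original work may equally have used Hive or Pig.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("pivot-sketch").getOrCreate()

# Hypothetical long-format data on HDFS: one row per (customer, metric, value).
long_df = spark.read.csv("hdfs:///data/metrics_long", header=True, inferSchema=True)

# Rows -> columns: one output column per distinct metric name.
wide_df = long_df.groupBy("customer_id").pivot("metric").sum("value")

# Columns -> rows: unpivot back to long format with the stack() SQL function.
metric_cols = [c for c in wide_df.columns if c != "customer_id"]
stack_expr = "stack({n}, {args}) as (metric, value)".format(
    n=len(metric_cols),
    args=", ".join(f"'{c}', `{c}`" for c in metric_cols))
back_to_long = wide_df.selectExpr("customer_id", stack_expr)

back_to_long.write.mode("overwrite").parquet("hdfs:///data/metrics_long_restored")
```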
Big Data Developer
Confidential
Responsibilities:
- Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using the CDH3 distribution.
- Experienced in managing and reviewing Hadoop log files
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Supported MapReduce programs running on the cluster.
- Imported and exported data between RDBMS and HDFS using Sqoop.
- Installed and configured Hive and wrote Hive UDFs.
- Involved in creating Hive tables, loading the data, and writing Hive queries that run internally as MapReduce jobs.
- Wrote Hive queries to meet the business requirements.
- Analyzed the data using Pig and wrote Pig scripts for grouping, joining, and sorting the data.
- Hands-on experience with NoSQL databases.
- Worked on MongoDB using CRUD (Create, Read, Update, and Delete), Indexing, Replication, and Sharding features (see the sketch after this list).
- Participated in the requirements gathering and analysis phase of the project, documenting the business requirements by conducting workshops and meetings with various business users.
- Designed and Developed Dashboards using Tableau.
- Actively participated in weekly meetings with the technical teams to review the code.
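A minimal sketch of the MongoDB CRUD and indexing operations mentioned above, shown here with the pymongo driver for illustration; the connection string, database, collection, and document fields are hypothetical.

```python
from pymongo import MongoClient, ASCENDING

# Hypothetical connection string, database, and collection.
client = MongoClient("mongodb://localhost:27017")
customers = client["analytics"]["customers"]

# Create: insert a document.
customers.insert_one({"customer_id": "C100", "name": "Alice", "segment": "retail"})

# Read: find documents matching a filter.
for doc in customers.find({"segment": "retail"}):
    print(doc["customer_id"], doc["name"])

# Update: modify a matched document.
customers.update_one({"customer_id": "C100"}, {"$set": {"segment": "premium"}})

# Delete: remove a matched document.
customers.delete_one({"customer_id": "C100"})

# Index: support fast lookups on customer_id.
customers.create_index([("customer_id", ASCENDING)], unique=True)
```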
Software Engineer
Confidential
Responsibilities:
- Developed various Java classes and SQL queries to retrieve and manipulate the data.
- Created use case diagrams, sequence diagrams, functional specifications, and user interface diagrams using StarUML.
- Involved in complete requirement analysis, design, coding and testing phases of the project.
- Analyzed and gathered business requirements.
- Developed code to create XML files and flat files from data retrieved from databases and XML files (see the sketch after this list).
- Implemented queries using SQL.
- Developed complex SQL queries and stored procedures to process and store the data.
- Involved in unit testing and bug fixing.
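The work above was done in Java and SQL; the following Python sketch only illustrates the general pattern of retrieving rows from a database and writing them out as XML and flat files. The database, table, and column names are hypothetical.

```python
import csv
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical database and table; the query and column names are placeholders.
conn = sqlite3.connect("orders.db")
rows = conn.execute("SELECT order_id, customer, total FROM orders").fetchall()

# Write the result set as an XML file.
root = ET.Element("orders")
for order_id, customer, total in rows:
    order = ET.SubElement(root, "order", id=str(order_id))
    ET.SubElement(order, "customer").text = customer
    ET.SubElement(order, "total").text = str(total)
ET.ElementTree(root).write("orders.xml", encoding="utf-8", xml_declaration=True)

# Write the same result set as a flat (CSV) file.
with open("orders.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["order_id", "customer", "total"])
    writer.writerows(rows)

conn.close()
```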