
Senior Hadoop Developer Resume


Florida

SUMMARY:

  • 6+ years of professional IT experience in the analysis, design, development, testing, documentation, deployment, integration, and maintenance of web-based and client/server applications using Java and Big Data technologies.
  • 5 years of experience in application development and data management using Hadoop and related Big Data technologies such as HBase, Hive, Pig, Flume, Oozie, Sqoop, Kafka, Spark, and ZooKeeper
  • In-depth knowledge of data structures and the design and analysis of algorithms, with a good understanding of Data Mining and Machine Learning techniques
  • Excellent knowledge of Hadoop architecture and its components such as HDFS, Job Tracker, Task Tracker, Name Node, and Data Node
  • Well versed in installing, configuring, supporting, and managing Big Data workloads and the underlying infrastructure of a Hadoop cluster
  • Hadoop/Big Data technology experience spanning the storage, querying, processing, and analysis of data
  • Proficient in the design and development of MapReduce programs using Apache Hadoop to analyze big data as per requirements
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components like HDFS, MapReduce, HBase, Zookeeper, Oozie, Hive, Sqoop, Pig, and Flume.
  • Experience with distributed systems, large-scale non-relational data stores, MapReduce systems, data modeling, and big data systems
  • Good knowledge of job scheduling and workflow design tools like Oozie.
  • Installed and configured MapReduce, Hive, and HDFS; implemented a CDH3 Hadoop cluster on CentOS; assisted with performance tuning and monitoring
  • Good understanding of NoSQL databases such as HBase, Cassandra and MongoDB.
  • Extensive knowledge of data serialization formats like Avro and SequenceFile
  • Experience in monitoring and controlling large-scale cloud (AWS) infrastructure
  • Extended Hive and Pig core functionality by writing custom UDFs.
  • Developed Python scripts to format and create daily transmission files (see the sketch after this summary)
  • Worked extensively with dimensional modeling, data migration, data cleansing, data profiling, and ETL processes for data warehouses
  • Sound grasp of relational database concepts; worked extensively with Oracle, MySQL, DB2, and SQL Server
  • Good understanding of ECMP and multi-path networking; solid understanding of TCP/IP networking and socket programming.
  • Experience with communication protocols such as TCP/IP and HTTP.
  • Good Experience with databases, writing complex queries and stored procedures using SQL and PL/SQL
  • Expertise in SOAP web services architecture with WSDL using JAX-RPC
  • Expertise in using version control tools such as Subversion (SVN), Rational ClearCase, CVS, and Git
  • Hands-on experience in developing web applications using the Spring Framework web module and integration with the Struts MVC framework
  • Excellent experience with data warehouse concepts such as star schema, snowflake schema, and fact and dimension tables
  • Good understanding of slowly changing dimensions (SCDs) such as SCD1, SCD2, and SCD3
  • Experience in full-stack development (Java, Scala, Python, etc.)
  • Developed a scalable, cost-effective, and fault-tolerant data warehouse system on Amazon EC2 Cloud.
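
As an illustration of the Python scripting noted above, a minimal sketch of a daily transmission-file generator; the output directory, field names, and pipe delimiter are hypothetical rather than taken from any particular project.

    import csv
    import datetime
    import pathlib

    def write_transmission_file(records, out_dir="/data/outbound"):
        """Format records as a pipe-delimited daily transmission file (hypothetical layout)."""
        today = datetime.date.today().strftime("%Y%m%d")
        out_path = pathlib.Path(out_dir) / ("transmission_" + today + ".txt")
        with open(out_path, "w", newline="") as fh:
            writer = csv.writer(fh, delimiter="|")
            writer.writerow(["account_id", "amount", "posted_date"])  # hypothetical header
            for rec in records:
                writer.writerow([rec["account_id"], "%.2f" % rec["amount"], rec["posted_date"]])
        return out_path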

WORK EXPERIENCE:

Senior Hadoop Developer

Confidential - Florida

Responsibilities:

  • Involved in the complete SDLC of the project, including requirements gathering, design documents, development, testing, and production environments.
  • Responsible for managing data coming from different sources; involved in HDFS maintenance and in loading structured and unstructured data.
  • Gathered and analyzed requirements and coordinated onsite and offshore team members.
  • Developed Java MapReduce programs to transform mainframe data into a structured format
  • Worked on MongoDB using its CRUD (Create, Read, Update and Delete), Indexing, Replication, and Sharding features.
  • Performed data analysis in Hive by creating tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Streamed data in real time using Spark with Kafka.
  • Created Hive external tables, loaded data into the tables, and queried the data using HQL
  • Developed optimal strategies for distributing the mainframe data over the cluster; imported and exported the stored mainframe data into HDFS and Hive.
  • Implemented Hive generic UDFs to incorporate business logic into Hive queries
  • Implemented Spark using Python (PySpark) and SparkSQL for faster testing and processing of data (see the sketch after this list)
  • Used the HBase API to store data from Hive tables into HBase tables
  • Wrote Hive queries joining multiple tables based on business requirements
  • Monitored workload, job performance and capacity planning using Cloudera Manager
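
For context, a minimal PySpark/SparkSQL sketch of the Hive external-table and HQL work described above. The table name, columns, HDFS location, and application name are hypothetical; only the general pattern (a Hive-enabled SparkSession plus spark.sql) is intended.

    from pyspark.sql import SparkSession

    # SparkSession with Hive support so spark.sql() can reach the Hive metastore
    spark = (SparkSession.builder
             .appName("hive-analysis-sketch")      # hypothetical application name
             .enableHiveSupport()
             .getOrCreate())

    # Hive external table over data already landed in HDFS (hypothetical schema and location)
    spark.sql("""
        CREATE EXTERNAL TABLE IF NOT EXISTS claims_raw (
            claim_id STRING,
            member_id STRING,
            claim_amount DOUBLE,
            service_date STRING
        )
        ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
        LOCATION '/data/claims/raw'
    """)

    # HQL aggregation executed through Spark rather than classic MapReduce
    monthly = spark.sql("""
        SELECT substr(service_date, 1, 7) AS service_month,
               count(*)                   AS claim_count,
               sum(claim_amount)          AS total_amount
        FROM claims_raw
        GROUP BY substr(service_date, 1, 7)
    """)
    monthly.show()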

Big Data Developer

Confidential

Responsibilities:

  • Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using the CDH3 distribution.
  • Involved in creating Hive tables, loading the data, and writing Hive queries that run internally as MapReduce jobs
  • Involved in writing MapReduce jobs
  • Streamed data in real time using Spark with Kafka (see the streaming sketch below)
  • Responsible for developing a data pipeline using Flume, Sqoop, and Pig to extract data from web logs and store it in HDFS
  • Installed and configured Hive and wrote Hive UDFs
  • Developed data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis
  • Used Pig for transformations, event joins, filtering of bot traffic, and some pre-aggregations before storing the data in HDFS
  • Extracted and updated data in MongoDB using the mongoimport and mongoexport command-line utilities
  • Supported MapReduce programs running on the cluster.
  • Involved in developing Pig scripts for change data capture and delta record processing between newly arrived data and existing data in HDFS
  • Designed and Developed Dashboards using Tableau
  • Involved in pivoting HDFS data from rows to columns and columns to rows

Environment: Hadoop, MapReduce, MongoDB, YARN, Hive, Pig, HBase, Spark, Kafka, Tableau, Oozie, Sqoop, Flume, Oracle 11g, Core Java, Cloudera, HDFS, Eclipse
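
A minimal sketch of one way to stream data from Kafka with Spark, using the Structured Streaming API (the older DStream API would also fit the work described above). The broker address, topic, and HDFS paths are hypothetical, and the spark-sql-kafka connector is assumed to be on the classpath.

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("kafka-stream-sketch")       # hypothetical application name
             .getOrCreate())

    # Read events from a Kafka topic as a streaming DataFrame (broker/topic are hypothetical)
    events = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "broker1:9092")
              .option("subscribe", "weblog-events")
              .load()
              .selectExpr("CAST(value AS STRING)"))

    # Land the raw events in HDFS; the checkpoint makes the stream restartable
    query = (events.writeStream
             .format("text")
             .option("path", "/data/streams/weblog-events")
             .option("checkpointLocation", "/checkpoints/weblog-events")
             .start())

    query.awaitTermination()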

Big Data Developer

Confidential

Responsibilities:

  • Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using the CDH3 distribution.
  • Experienced in managing and reviewing Hadoop log files
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data
  • Load and transform large sets of structured, semi structured and unstructured data.
  • Supported MapReduce programs running on the cluster
  • Imported and exported data between RDBMSs and HDFS using Sqoop.
  • Installed and configured Hive and wrote Hive UDFs.
  • Involved in creating Hive tables, loading the data, and writing Hive queries that run internally as MapReduce jobs.
  • Wrote Hive queries to meet the business requirements.
  • Analyzed the data using Pig and wrote Pig scripts to group, join, and sort the data.
  • Hands-on experience with NoSQL databases.
  • Worked on MongoDB using its CRUD (Create, Read, Update and Delete), Indexing, Replication, and Sharding features (see the sketch after this list).
  • Participated in the requirements gathering and analysis phase of the project, documenting the business requirements by conducting workshops/meetings with various business users.
  • Designed and Developed Dashboards using Tableau.
  • Actively participated in weekly meetings with the technical teams to review the code.
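
As an illustration of the MongoDB CRUD and indexing work noted above, a minimal PyMongo sketch; the connection string, database, collection, and document fields are hypothetical.

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")    # hypothetical connection string
    customers = client["analytics"]["customers"]         # hypothetical database/collection

    # Create
    customers.insert_one({"customer_id": "C1001", "segment": "retail", "score": 72})

    # Read, supported by an index on customer_id
    customers.create_index("customer_id")
    doc = customers.find_one({"customer_id": "C1001"})

    # Update
    customers.update_one({"customer_id": "C1001"}, {"$set": {"score": 80}})

    # Delete
    customers.delete_one({"customer_id": "C1001"})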

Software Engineer

Confidential

Responsibilities:

  • Developed various Java classes and SQL queries to retrieve and manipulate the data.
  • Created use case diagrams, sequence diagrams, functional specifications, and user interface diagrams using StarUML.
  • Involved in complete requirement analysis, design, coding and testing phases of the project.
  • Analyzed and gathered business requirements.
  • Developed code to create XML files and flat files from data retrieved from databases and XML sources (see the sketch after this list).
  • Implemented queries using SQL.
  • Developed complex SQL queries and stored procedures to process and store the data.
  • Involved in unit testing and bug fixing.
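
A minimal sketch of exporting query results to a flat file and an XML file. The project work described above was in Java; this is a Python stand-in for consistency with the other sketches here, and the database, table, and column names are hypothetical.

    import csv
    import sqlite3                       # stand-in driver for illustration only
    import xml.etree.ElementTree as ET

    conn = sqlite3.connect("orders.db")                          # hypothetical database
    rows = conn.execute("SELECT order_id, customer, total FROM orders").fetchall()

    # Flat, pipe-delimited file
    with open("orders.txt", "w", newline="") as fh:
        writer = csv.writer(fh, delimiter="|")
        writer.writerow(["order_id", "customer", "total"])
        writer.writerows(rows)

    # XML file with one element per row
    root = ET.Element("orders")
    for order_id, customer, total in rows:
        order = ET.SubElement(root, "order", id=str(order_id))
        ET.SubElement(order, "customer").text = customer
        ET.SubElement(order, "total").text = "%.2f" % total
    ET.ElementTree(root).write("orders.xml", encoding="utf-8", xml_declaration=True)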
