Sr. Hadoop/Spark/Big Data Developer Resume
Detroit, MI
SUMMARY:
- 8+ years of experience in the IT industry, including 3.5 years of experience with Big Data and Hadoop ecosystem components such as Spark, Pig, Hive, Sqoop, Oozie, Java, MapReduce, HBase, Cassandra, Kafka, ZooKeeper and Flume.
- Excellent understanding and knowledge of Big Data and Hadoop architecture.
- Hands-on experience using YARN and tools like Pig and Hive for data analysis, Sqoop for data ingestion, Oozie for scheduling and ZooKeeper for coordinating cluster resources.
- Good understanding of HDFS design, daemons and HDFS high availability (HA).
- Excellent understanding of Hadoop architecture and the daemons of a Hadoop cluster, including the ResourceManager, NodeManager, NameNode and DataNode.
- Experience working with Hadoop in standalone, pseudo-distributed and fully distributed modes.
- Experience in writing Pig Latin scripts to sort, group, join and filter data as part of data transformations per business requirements.
- Expert in working with the Hive data warehouse: creating tables and managing data distribution by implementing partitioning and bucketing (see the sketch after this list).
- Expertise in writing ad-hoc queries using HiveQL.
- Experience with NoSQL databases like HBase and Cassandra, as well as other ecosystem components like ZooKeeper, Oozie, Impala, Storm, Spark Streaming, Spark SQL, Kafka and Flume.
- Extended Hive and Pig core functionality with custom UDFs written in Java.
- Experience in importing and exporting data using Sqoop between HDFS (Hive & HBase) and relational database systems (Oracle & Teradata).
- Hands-on experience setting up workflows with the Apache Oozie workflow engine to manage and schedule Hadoop jobs using the Oozie coordinator.
- Hands-on experience with UNIX and shell scripting.
- Good understanding of SQL, database concepts and data warehouse/ETL technologies such as Informatica and Talend.
- Hands-on experience with AWS infrastructure services: Amazon Simple Storage Service (S3), Amazon Elastic Compute Cloud (EC2) and Elastic MapReduce (EMR).
- Extensive experience in Requirements gathering, Analysis, Design, Reviews, Coding and Code Reviews, Unit and Integration Testing.
- Extensive experience using Java/J2EE design patterns such as Singleton, Factory, MVC and Front Controller to reuse effective and efficient design strategies.
- Expertise in IDEs such as WebSphere Studio Application Developer (WSAD), Eclipse, NetBeans, MyEclipse and WebLogic Workshop.
- Experience in developing service components using JDBC.
- Experience in developing and designing Web Services (SOAP and Restful Web services).
- Experience in developing web interfaces using Servlets, JSP and custom tag libraries.
- Solid experience developing applications using the Scrum methodology.
- Thorough knowledge of the software development life cycle (SDLC), database design, RDBMS and data warehousing.
- Experience writing complex SQL queries involving multiple tables with inner and outer joins.
- Excellent interpersonal and communication skills; creative, research-minded, technically competent and results-oriented, with problem-solving and leadership skills.
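A minimal sketch, not taken from any of the projects below, of the Hive partitioning and bucketing pattern referenced above, using Spark SQL from Scala with Hive support enabled; the database, table and column names are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: create a partitioned, bucketed Hive table and run an ad-hoc
// HiveQL query against one partition. All names are illustrative.
object HivePartitionBucketSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-partition-bucket-sketch")
      .enableHiveSupport()
      .getOrCreate()

    spark.sql("CREATE DATABASE IF NOT EXISTS sales_db")

    // Partition by load date; bucket by customer_id to help joins and sampling.
    spark.sql(
      """CREATE TABLE IF NOT EXISTS sales_db.transactions (
        |  txn_id STRING,
        |  customer_id BIGINT,
        |  amount DOUBLE
        |)
        |PARTITIONED BY (load_date STRING)
        |CLUSTERED BY (customer_id) INTO 32 BUCKETS
        |STORED AS ORC""".stripMargin)

    // Ad-hoc query restricted to a single partition, so only that partition's
    // files are scanned.
    spark.sql(
      """SELECT customer_id, SUM(amount) AS total_amount
        |FROM sales_db.transactions
        |WHERE load_date = '2017-01-01'
        |GROUP BY customer_id""".stripMargin).show()

    spark.stop()
  }
}
```

Partitioning by a date column keeps ad-hoc queries from scanning the whole table, while bucketing on the join key spreads rows evenly across a fixed number of files.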
TECHNICAL SKILLS:
Big Data Technologies: Hadoop, MapReduce, HDFS, Hive, Pig, ZooKeeper, Sqoop, Oozie, Flume, Impala, HBase, Kafka, Storm.
Big Data Frameworks: HDFS, YARN, Spark.
Hadoop Distributions: Cloudera (CDH3, CDH4, CDH5), Hortonworks, Amazon EMR, EC2.
Programming Languages: Java, Scala, Shell scripting.
Databases: RDBMS, MySQL, Oracle, Microsoft SQL Server, Teradata, DB2, PL/SQL, Cassandra, MongoDB.
IDE and Tools: Eclipse, NetBeans, Tableau.
Operating System: Windows, Linux/Unix.
Frameworks: Spring, Hibernate, JSF, EJB, JMS.
Scripting Languages: JSP & Servlets, JavaScript, XML, HTML, Python.
Application Servers: Apache Tomcat, WebSphere, WebLogic, JBoss.
Methodologies: Agile, SDLC, Waterfall.
Web Services: Restful, SOAP.
ETL Tools: Talend, Informatica.
Others: Solr, Elasticsearch.
PROFESSIONAL EXPERIENCE:
Confidential - Detroit, MI
Sr. Hadoop/Spark/Big Data Developer
Responsibilities:
- Worked on analyzing the Hadoop cluster using different big data analytic tools including Flume, Pig, Hive, HBase, Oozie, ZooKeeper, Sqoop, Spark and Kafka.
- Developed Spark code using Scala and Spark SQL/Spark Streaming for faster testing and processing of data.
- Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
- Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs and Spark on YARN.
- Experienced with batch processing of data sources using Apache Spark and Elasticsearch.
- Developed analytical components using Scala, Spark, Apache Mesos and Spark Streaming.
- Experienced with NoSQL databases like HBase, MongoDB and Cassandra.
- Installed Hadoop, MapReduce and HDFS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
- Developed Kafka producers and consumers, and Spark and Hadoop MapReduce jobs.
- Imported data from different sources such as HDFS and HBase into Spark RDDs.
- Involved in converting MapReduce programs into Spark transformations using Spark RDDs in Scala.
- Developed Spark scripts by using Scala Shell commands as per the requirement.
- Performed transformations, cleaning and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.
- Loaded data into Spark RDDs and performed in-memory computation to generate the output response (a minimal sketch follows this list).
- Loaded data into HBase using both bulk and non-bulk loads.
- Experience with AWS infrastructure services: Amazon Simple Storage Service (S3), Amazon Elastic Compute Cloud (EC2) and Elastic MapReduce (EMR).
- Experience with the Oozie workflow scheduler to manage Hadoop jobs with control flows.
- Expertise in data modeling and data warehouse design and development.
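A minimal sketch of the HDFS-to-Spark loading and in-memory computation pattern described above, assuming a tab-delimited log layout; the HDFS path, case class and field names are hypothetical placeholders rather than the actual project schema.

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: load raw text from HDFS into an RDD, convert it to a typed
// Dataset, and run an in-memory Spark SQL aggregation over it.
object SparkIngestSketch {
  case class Event(userId: String, action: String, ts: Long)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("spark-ingest-sketch")
      .getOrCreate()
    import spark.implicits._

    // Read delimited log lines from HDFS and map them to Event records.
    val events = spark.sparkContext
      .textFile("hdfs:///data/raw/events/*")
      .map(_.split("\\t"))
      .filter(_.length >= 3)                       // drop malformed records
      .map(f => Event(f(0), f(1), f(2).toLong))
      .toDS()

    // Register a temp view and aggregate in memory with Spark SQL.
    events.createOrReplaceTempView("events")
    spark.sql(
      """SELECT action, COUNT(*) AS cnt
        |FROM events
        |GROUP BY action
        |ORDER BY cnt DESC""".stripMargin).show()

    spark.stop()
  }
}
```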
Environment: Hadoop, HDFS, Spark, MapReduce, Pig, Hive, Sqoop, Kafka, HBase, Oozie, Flume, Scala, Python, Java, SQL Scripting and Linux Shell Scripting, Cloudera, Cloudera Manager, EC2, EMR, S3.
Confidential - Dallas, TX
Sr. Big data/Hadoop Developer
Responsibilities:
- Executing Hive queries and Pig Scripts.
- Proactively monitored systems and services; worked on architecture design and implementation of Hadoop deployment, configuration management, backup and disaster recovery systems and procedures.
- Involved in defining job flows, managing and reviewing log files.
- Supported MapReduce programs running on the cluster.
- As a Big Data Developer, implemented solutions for ingesting data from various sources and processing data at rest using Big Data technologies such as Hadoop, MapReduce, MongoDB, Hive, Oozie, Flume, Sqoop and Talend.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Imported bulk data into MongoDB using MapReduce programs.
- Developed Apache Pig and Hive scripts to process HDFS data.
- Performed analytics on time-series data stored in MongoDB using the MongoDB API.
- Designed and implemented Incremental Imports into Hive tables.
- Developed data integration jobs using Talend to collect information into the database and persist it for future use.
- Involved in unit testing and delivered unit test plans and results documents using JUnit and MRUnit.
- Involved in collecting, aggregating and moving data from servers to HDFS using Apache Flume and Talend.
- Wrote Hive jobs to parse logs and structure them in a tabular format to facilitate effective querying of the log data (see the sketch after this list).
- Wrote multiple Java programs to pull data from MongoDB.
- Involved in file processing using Pig Latin.
- Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
- Optimized MapReduce algorithms using combiners and partitioners to deliver the best results, and worked on application performance optimization for an HDFS cluster.
- Implemented business logic by writing UDFs in Java and used various UDFs from Piggybank and other sources.
- Worked on debugging, performance tuning of Hive & Pig Jobs.
- Used Hive to find correlations between customers' browser logs across different sites and analyzed them to build risk profiles for those sites.
- Created and maintained Technical documentation for launching Hadoop Clusters and for executing Hive queries and Pig Scripts.
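The log-structuring work above was done with Hive jobs; purely for illustration, here is a hedged Scala/Spark SQL sketch of the same parse-raw-logs-into-a-queryable-table idea. The log pattern, HDFS path and table name are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: parse raw access-log lines into columns and persist them as a
// table so the logs can be queried with plain SQL.
object LogToTableSketch {
  // Hypothetical common-log-format pattern: host, timestamp, request, status, bytes.
  private val LogLine =
    """(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\d+|-)""".r

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("log-to-table-sketch")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    val parsed = spark.sparkContext
      .textFile("hdfs:///data/raw/access_logs/*")
      .flatMap { line =>
        // Skip lines that do not match the expected pattern.
        LogLine.findFirstMatchIn(line).map { m =>
          (m.group(1), m.group(2), m.group(3), m.group(4), m.group(5).toInt)
        }
      }
      .toDF("host", "ts", "method", "url", "status")

    // Persist as a managed table for downstream ad-hoc querying.
    parsed.write.mode("overwrite").saveAsTable("access_logs")

    spark.stop()
  }
}
```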
Environment: Java, Hadoop, Map Reduce, Pig, Hive, Linux, Sqoop, Storm, Flume, AWS EC2, EMR and Hortonworks data platform.
Confidential - Columbus, OH
Sr. Hadoop Developer
Responsibilities:
- Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables.
- Involved in data ingestion into HDFS using Sqoop from a variety of sources, using JDBC connectors and import parameters.
- Responsible for managing data from various sources and their metadata.
- Worked with the NoSQL database HBase to create tables and store data.
- Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
- Worked with NoSQL databases like HBase, creating HBase tables to load large sets of semi-structured data coming from various sources (see the sketch after this list).
- Installed and configured Hive and wrote Hive UDFs that helped spot market trends.
- Used Hadoop Streaming to process terabytes of data in XML format.
- Involved in loading data from UNIX file system to HDFS.
- Implemented the Fair Scheduler on the JobTracker with appropriate parameters to share cluster resources among users' MapReduce jobs.
- Involved in creating Hive tables, loading them with data and writing Hive queries to analyze the data.
- Gained good business knowledge of the different product categories and their designs.
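A minimal sketch of writing records into an HBase table through the standard HBase client API from Scala; the table name, column family and row layout are hypothetical stand-ins for the project-specific tables.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes

// Sketch only: put a single semi-structured record into an HBase table.
object HBaseLoadSketch {
  def main(args: Array[String]): Unit = {
    val conf = HBaseConfiguration.create()          // picks up hbase-site.xml from the classpath
    val connection = ConnectionFactory.createConnection(conf)
    try {
      val table = connection.getTable(TableName.valueOf("products"))
      val put = new Put(Bytes.toBytes("row-0001"))  // row key
      put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("widget"))
      put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("category"), Bytes.toBytes("hardware"))
      table.put(put)
      table.close()
    } finally {
      connection.close()
    }
  }
}
```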
Environment: CDH4 with Hadoop, HDFS, Pig, Hive, HBase, ZooKeeper, MapReduce, Core Java, Sqoop, Oozie, Kafka, Storm, Logstash, Elasticsearch, Kibana, Redis, Flume, Linux, UNIX shell scripting and Big Data.
Confidential - Raleigh, NC
Sr. Java developer
Responsibilities:
- Involved in the analysis, design, implementation and testing phases of the software development life cycle (SDLC) of the project.
- Assisted the analysis team in performing the feasibility analysis of the project.
- Designed use case, class, sequence and object diagrams in the detailed design phase of the project using Rational Rose 4.0.
- Developed presentation layer of the project using HTML, JSP 2.0, JSTL and JavaScript technologies.
- Developed the complete business tier using stateless and stateful session beans to EJB 2.0 standards using WebSphere Studio Application Developer (WSAD 5.0).
- Used various J2EE design patterns, such as DTO, DAO, Business Delegate, Service Locator, Session Facade, Singleton and Factory.
- Worked with the Linux OS while porting the existing application to Windows.
- Consumed Web Service for transferring data between different applications.
- Integrated Spring DAO for data access using Hibernate.
- Created Hibernate mapping files to map POJOs to DB tables.
- Wrote complex SQL queries, stored procedures, functions and triggers in PL/SQL.
- Configured and used Log4J for logging all the debugging and error information.
- Developed Ant build scripts for compiling and building the project.
- Used IBM WebSphere Portal and IBM WebSphere Application Server for deploying the applications.
- Used CVS Repository for Version Control.
- Created test plans and JUnit test cases and test suite for testing the application.
- Good hands-on knowledge of UNIX commands, used to view log files on the server.
- Assisted in Developing testing plans and procedures for unit test, system test, and acceptance test.
- Unit test case preparation and Unit testing as part of the development.
- Used Log4J components for logging; performed daily monitoring of log files and resolved issues.
Environment: Java 1.4.1, JSP 2.0, HTML, JavaScript, EJB 2.0, Struts 1.1, JDBC 2.0, IBM WebSphere 5.0, XML, XSLT, XML Schema, JUnit 3.8.1, Rational Rose 4.0, Ant 1.5, UML, Hibernate 3, Linux, Oracle 9i and Windows.
Confidential
Java Developer
Responsibilities:
- Technical responsibilities included high level architecture and rapid development.
- Design architecture following J2EE MVC framework.
- Developed interfaces using HTML, JSP pages and Struts (presentation view).
- Involved in designing & developing web-services using SOAP and WSDL.
- Developed and implemented Servlets running under JBoss.
- Used J2EE design patterns for middle-tier development.
- Used J2EE design patterns and the Data Access Object (DAO) pattern for the business tier and integration tier of the project.
- Created UML class diagrams that depict the code's design and its compliance with the functional requirements.
- Developed various EJBs for handling business logic and data manipulations from database.
- Designed and developed the UI using Struts view component, JSP, HTML, CSS and JavaScript.
- Implemented CMP entity beans for persistence in the business logic layer.
- Developed database interaction code with the JDBC API, making extensive use of SQL query statements and prepared statements.
- Involved in writing Spring configuration XML files containing bean declarations and declarations of other dependent objects.
- Inspection/Review of quality deliverables such as Design Documents.
- Involved in creating and running test cases for JUnit testing.
- Experience in implementing Web Services using SOAP, REST and XML/HTTP technologies.
- Used Log4J to print debug, warning and info log messages on the server console.
- Wrote SQL scripts, stored procedures and SQL*Loader scripts to load data.
Environment: Java, J2EE, Spring, JSP, Hibernate, Java Script, CSS, JDBC, IntelliJ, LDAP, REST, Active Directory, SAML, Web Services, Microsoft SQL Server, HTML.