Sr. Big Data Engineer Resume
Washington, DC
SUMMARY:
- Well-rounded professional with 10+ years of experience in the design, development, maintenance, and support of Java/J2EE and Confidential applications.
- 4+ years of experience as a developer in the Confidential ecosystem (Spark, HBase, MapReduce, Hive, Pig, Solr, Kafka, Flume, Sqoop).
- Skilled in MapReduce programming using Java, implementing XML SerDes, and providing data summarization, querying, and analysis of large datasets using Hive.
- Strong knowledge of the Confidential architecture and its components, including HDFS, Job Tracker, Task Tracker, Name Node, Data Node, YARN, Resource Manager, Node Manager, and the MapReduce programming paradigm.
- Extensive experience in designing complex data flows using StreamSets.
- Familiar with the Neo4j graph database and writing Cypher queries.
- Highly skilled in integrating Kafka with Spark Streaming for high-speed data processing (see the sketch after this summary).
- Experienced with Spark for improving the performance and optimization of existing algorithms in Confidential using Spark Context, Spark Streaming, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Persisted processed data in HBase for faster querying and random access.
- Sound knowledge of using Apache Solr to search structured and unstructured data.
- Defined job flows using the CAWA scheduler to automate Confidential jobs.
- Experienced in importing and exporting data between traditional databases and Confidential.
- Familiar with Confidential authentication and authorization workflows.
- Evaluated Confidential infrastructure requirements and designed/deployed solutions.
- Assessed technology stacks and implemented various proofs of concept (POCs) for eventual adoption within the Big Data Confidential initiative.
- Strong experience applying design patterns and developing multi-tier applications.
- Extensive experience in Java/J2EE programming: JDBC, Servlets, JSP, JSTL, JMS, and EJB 2.0/3.0.
- Expert knowledge of J2EE design patterns such as MVC, Front Controller, Session Facade, Business Delegate, and Data Access Object for building J2EE applications.
- Experienced in web development using HTML, DHTML, XHTML, CSS, JavaScript, AJAX, and AngularJS.
- Experienced in developing MVC-framework-based websites using JSF, Struts, and Spring.
- Experience in building web applications using Spring Framework features such as MVC (Model View Controller), AOP (Aspect-Oriented Programming), IOC (Inversion of Control), DAO (Data Access Object), and template classes.
- Experience in creating and consuming RESTful web services using JAX-RS (Jersey).
- Working knowledge in multi-tiered distributed environment, OOAD concepts, good understanding of Software Development Lifecycle (SDLC) and Service Oriented Architecture (SOA).
- Experience working in environments using Agile (Scrum), RUP, and Test-Driven Development methodologies.
- Experience working on both Windows and Unix platforms, including programming and debugging in Unix shell scripting.
- Extensive experience in developing Use Cases, Activity Diagrams, Sequence Diagrams and Class Diagrams using Visio.
- Good knowledge of IDE tools such as Eclipse, NetBeans, JBuilder, and Rational Application Developer (RAD) for Java/J2EE application development.
- Experience in using Maven for build automation.
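As an illustration of the Kafka and Spark Streaming integration mentioned above, here is a minimal sketch of a direct-stream consumer, assuming the Kafka 0.8-style integration that ships with Spark 1.6; the broker addresses, topic name, and class names are hypothetical placeholders rather than actual project code.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

import kafka.serializer.StringDecoder;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;
import scala.Tuple2;

public class KafkaSparkStreamingSketch {
    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("KafkaSparkStreamingSketch");
        // 10-second micro-batches
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        // Hypothetical broker list and topic name; replace with the real cluster values
        Map<String, String> kafkaParams = new HashMap<String, String>();
        kafkaParams.put("metadata.broker.list", "broker1:9092,broker2:9092");
        Set<String> topics = Collections.singleton("transactions");

        // Direct (receiver-less) Kafka stream of <key, value> pairs
        JavaPairInputDStream<String, String> messages = KafkaUtils.createDirectStream(
                jssc, String.class, String.class, StringDecoder.class, StringDecoder.class,
                kafkaParams, topics);

        // Keep only the message payloads; a real job would parse and write them to HBase/HDFS
        JavaDStream<String> payloads = messages.map(new Function<Tuple2<String, String>, String>() {
            public String call(Tuple2<String, String> record) {
                return record._2();
            }
        });
        payloads.print();

        jssc.start();
        jssc.awaitTermination();
    }
}
```

The direct approach reads offsets straight from the Kafka brokers rather than through a receiver, which avoids the write-ahead log and simplifies recovery of the stream.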
TECHNICAL SKILLS:
Hadoop Ecosystem: MapReduce, Spark, HBase, Hive, Pig, Solr, Flume, Sqoop, Kafka, Oozie, Hue, Cloudera Manager, StreamSets, Neo4j
Programming Languages: Java, Groovy, Cypher
Java/JEE Technologies: JSP, Servlets, JDBC, JMS, EJB
Enterprise Frameworks: Spring, Struts, JSF, Hibernate.
Web Services: SOAP, Restful
Web/Application Server: Tomcat, WebLogic, WebSphere.
Web Technologies: HTML, JavaScript, CSS, jQuery, AJAX, AngularJS
Databases: Oracle, DB2, SQL Server.
IDEs: Eclipse, RAD, WSAD, NetBeans.
Other Tools & Packages: CVS, ClearCase, JUnit, Maven, Ant
SDLC Methodology: Agile, RUP, Waterfall model.
PROFESSIONAL EXPERIENCE
Confidential, Washington, DC
Sr. Big Data Engineer
Responsibilities:
- Worked on a live 120-node Confidential cluster running CDH 5.7.4.
- Exported and imported data between Confidential and relational databases using Sqoop.
- Designed many complex data flows and wrote Groovy code in StreamSets to match various business use cases.
- Loaded historical transactional data into the Neo4j graph database by writing Cypher queries.
- Used Hive to form an abstraction on top of structured data residing in HDFS and implemented partitioning, including dynamic partitions, on Hive tables.
- Developed UDFs in Java as needed for use in Hive queries.
- Worked with Sequence and Parquet file formats for performance enhancement.
- Involved in writing Spark applications that read data from Kafka topics and ingest it into HBase tables.
- Wrote MapReduce jobs for the extraction, transformation, and aggregation of data from various XML files generated during different phases of a transaction.
- Indexed data in Solr for faster query response and implemented custom logic on retail and digital data as required by the business.
- Worked with the Confidential admin to create CAWA jobs that schedule and orchestrate the ETL process.
- Wrote complex Hive and SQL queries for data analysis to meet business requirements.
- Solved performance issues in Hive and Pig scripts through an understanding of joins, grouping, and aggregation and how they translate to Elastic MapReduce jobs.
- Created Hive UDFs in Java, compiled them into JARs, added them to HDFS, and executed them in Hive queries.
- Configured Spark Streaming in Scala to receive real-time data from Kafka and store the stream data in HDFS.
- Imported metadata into Hive using Scala and migrated existing tables and applications to work on Hive.
- Created entities in Scala and Java, along with named queries, to interact with the database.
- Implemented Spark jobs using Scala and Spark SQL for faster testing and processing of data.
- Designed the HBase schema to avoid hot-spotting (see the row-key sketch after this list) and exposed data from HBase tables to the UI through a REST API.
- Worked on loading data from the Confidential cluster to Amazon S3 storage for processing in Amazon EMR.
- Part of the team that built a plug-in for Confidential providing the ability to use MongoDB as an input source and output destination for MapReduce, Spark, Hive, and Pig jobs.
- Composed application classes as Spring beans using Spring IOC/dependency injection.
- Designed and developed server-side components using Java, REST, and WSDL.
- Worked on a POC comparing the processing time of Impala with Apache Hive for batch applications, with the goal of adopting the former in the project.
- Coordinated with various support teams during many deployments.
- Involved in resolving data discrepancies, debugging issues, and performed data integrity checks between two data centers.
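A minimal sketch of the hot-spotting avoidance mentioned in the HBase schema bullet above: salting the row key with a small bucket prefix so that monotonically increasing keys spread across regions. The column family, bucket count, and helper names are hypothetical and not the actual schema used.

```java
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

/**
 * Hypothetical helper illustrating one common way to avoid HBase region
 * hot-spotting: prefix each row key with a small salt bucket so that
 * sequential keys (e.g. account id + timestamp) spread evenly across
 * regions instead of landing on a single region server.
 */
public final class SaltedRowKeys {

    // Assumed bucket count; in practice this is tuned to the number of regions
    private static final int SALT_BUCKETS = 16;

    private SaltedRowKeys() {
    }

    /** Builds a row key of the form "NN|naturalKey", where NN is the salt bucket. */
    public static byte[] saltedKey(String naturalKey) {
        int bucket = (naturalKey.hashCode() & Integer.MAX_VALUE) % SALT_BUCKETS;
        String salted = String.format("%02d|%s", bucket, naturalKey);
        return salted.getBytes(StandardCharsets.UTF_8);
    }

    /** Example: a Put for a transaction row keyed by account id and event time. */
    public static Put toPut(String accountId, long eventTime, byte[] payload) {
        Put put = new Put(saltedKey(accountId + "|" + eventTime));
        // "d" is an illustrative column family name, not the real schema
        put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("payload"), payload);
        return put;
    }
}
```

Readers that need a range scan over one natural key simply issue the same scan once per salt bucket, which is the usual trade-off for this design.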
Environment: Cloudera Confidential CDH 5.x, HDFS, YARN, Hue, Cloudera Manager, Hive 1.1.x, HBase, Spark 1.6, Solr 4.10.x, Kafka, Sqoop 1.99.x, StreamSets 2.1.x/2.2.x, Neo4j 2.3.x, Java 1.7, Eclipse, CAWA, Oracle 12c, Maven, MapReduce, Oozie, MongoDB, SQL, Spring MVC, Spring IOC, GitHub
Confidential, Bellevue, WA
Sr. Big Data Engineer
Responsibilities:
- Loaded data from Oracle database into HDFS.
- Developed Yarn MapReduce pipeline jobs to process the data and create necessary HFiles.
- Loaded the created HFiles into HBase for faster access to a large customer base without taking a performance hit.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Implemented business logic by writing UDFs in Java and used various UDFs from Piggybank and other sources.
- Performed unit testing of MapReduce jobs using MRUnit (see the test sketch after this list).
- Used the Oozie scheduler to automate the pipeline workflow.
- Worked on importing data from various sources and performed transformations using MapReduce and Pig to load data into HDFS.
- Worked with cross functional teams to design and develop a Big Data platform.
- Developed a real-time analytics data pipeline processing billions of events.
- Worked with QA and DevOps teams to troubleshoot issues that arose in production.
- Worked on compression mechanisms to optimize MapReduce Jobs.
- Solved the small-files problem by processing Sequence files in MapReduce.
- Processed streaming data from Kafka and published data for real-time segmentation and real-time marketing.
- Configured Sqoop jobs to import data from RDBMS into HDFS using Oozie workflows.
- Wrote various HiveQL and Pig scripts.
- Developed batch and real-time processing jobs using Spark, including Spark Streaming applications for real-time processing.
- Created HBase tables to store variable data formats coming from different portfolios.
- Performed real-time analytics on HBase using the Spark API and REST API; Python scripts were used for string processing.
- Analyzed customer behavior by performing clickstream analysis and used Flume to ingest the data.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Actively participated in the software development lifecycle (scope, design, implement, deploy, test), including design and code reviews, test development, and test automation.
- Worked in an agile environment.
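A minimal sketch of the kind of MRUnit test referenced above, assuming a simple word-count-style mapper; the mapper and class names are illustrative, not the actual pipeline code.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Before;
import org.junit.Test;

/** Hypothetical mapper that emits (word, 1) for every token in a line. */
class TokenCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                context.write(word, ONE);
            }
        }
    }
}

/** MRUnit test that exercises the mapper without a running cluster. */
public class TokenCountMapperTest {
    private MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;

    @Before
    public void setUp() {
        mapDriver = MapDriver.newMapDriver(new TokenCountMapper());
    }

    @Test
    public void emitsOneRecordPerToken() throws IOException {
        mapDriver.withInput(new LongWritable(0), new Text("alpha beta"))
                 .withOutput(new Text("alpha"), new IntWritable(1))
                 .withOutput(new Text("beta"), new IntWritable(1))
                 .runTest();
    }
}
```

Because the driver feeds records directly to the mapper in memory, these tests run in milliseconds and can be wired into the regular Maven build.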
Environment: CDH 4.x, CDH 5.x, MapReduce, Pig, Hive, HDFS, HBase, Avro, Oozie, Java 1.7, JIRA, Crucible, GitHub, Maven
Confidential, West Bloomfield Township, MI
Sr. Hadoop/J2EE Developer
Responsibilities:
- Developed managed, external, and partitioned tables as per the requirements.
- Ingested structured data into appropriate schemas and tables to support the rules and analytics.
- Developed custom user-defined functions (UDFs) in Hive to transform large volumes of data according to business requirements (see the UDF sketch after this list).
- Developed Pig scripts, Pig UDFs, Hive scripts, and Hive UDFs to load data files.
- Responsible for building scalable distributed data solutions using Confidential.
- Loaded data from the edge node to HDFS using shell scripting.
- Implemented scripts for loading data from UNIX file system to HDFS.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Automated workflows using shell scripts.
- Participated in SDLC requirements gathering, analysis, design, development, and testing of the application, which was developed using the Agile methodology.
- Actively participated in object-oriented analysis and design sessions for the project, which is based on MVC architecture using the Spring Framework.
- Participated in daily Scrum meetings, sprint planning, and task estimation for user stories, as well as retrospectives and the demo at the end of each sprint.
- Adopted J2EE design patterns like DTO, DAO, Command and Singleton.
- Used Spring IOC/ORM, AOP and Spring Security.
- Published web services (WSDL and SOAP) for retrieving required information from PostgreSQL.
- Implemented object-relational mapping in the persistence layer using the Hibernate framework in conjunction with Spring.
- Prepared System design documents as per business owner specification.
- Involved in data modeling and functionality design.
- Developed user interface mock-ups for client demonstrations.
- Followed the Agile Scrum methodology and participated in daily meetings.
- Developed functionality per SDDs using Java and J2EE.
- Used Spring and AngularJS frameworks for developing the application.
- Used the Spring framework to implement RESTful services.
- Used different features of the Spring framework such as IOC, AOP, transaction management, and security.
- Created the front-end screens using AngularJS, HTML, CSS, and Bootstrap.
- Created PL/SQL scripts for stored procedures, triggers and views.
- Prepared JUnit test cases and used Cobertura for code coverage.
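A minimal sketch of a Hive UDF of the kind described above, using the classic org.apache.hadoop.hive.ql.exec.UDF base class; the function name and normalization logic are hypothetical examples, not the project's actual transformations.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/**
 * Hypothetical Hive UDF that normalizes free-text codes (trim + upper-case)
 * so they can be grouped consistently in HiveQL.
 */
public final class NormalizeCode extends UDF {

    /** Hive calls evaluate() once per row; null input maps to null output. */
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toUpperCase());
    }
}
```

Once compiled into a JAR, a function like this is typically registered in a Hive session with ADD JAR and CREATE TEMPORARY FUNCTION before being called in queries.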
Environment: HDFS, Pig, Hive, UNIX shell scripting, PostgreSQL, Java 1.6, Spring 4.0, AngularJS, RESTful services, Eclipse, SVN, JIRA, WebLogic 11, Oracle 11g, PL/SQL, JUnit 5, SONAR, UNIX.
Confidential, Paso Robles, CA
Sr. Java/J2EE Developer
Responsibilities:
- Designed and developed the web-based UI using JSP, jQuery, AJAX, CSS, JSTL tag libraries, and HTML.
- Used the Struts framework to implement the MVC design pattern.
- Developed various JavaScript functions and events using jQuery to perform validations and AJAX calls to load data to and from the controller.
- Developed unit test cases, functional test cases and test clients with JUnit.
- Used shell scripts to run UNIX commands and check the logs for runtime errors.
- Used JDBC for the DAO layer and the database operations.
- Published and consumed SOAP-based web services; testing was done using SoapUI.
- Used Log4j to log errors, warnings, and info messages.
- Developed dynamic JSP pages with Struts and used built-in/custom Struts interceptors and validators (see the action sketch after this list).
- Used Spring-MyBatis integration to run PL/SQL queries and call procedures and packages to access the Oracle database.
- Used JavaServer Faces and Facelets to prepare the logical parts of the presentation pages.
- Responsible for writing unit, system, and functional test cases for the system.
- Prepared high-level and detail-level design documents before developing the code according to the required specifications.
- Tested the application in development and test environments and deployed it on JBoss.
- Used SVN as the version control system for source code and project documents; bug fixing and tracking were done in Quality Center.
- Participated in business requirement sessions to understand the requirements from business users.
- Participated in sprint planning, estimated stories, and defined tasks for the current sprint's stories.
- Adopted Agile methods for development and delivery of the solution.
- Designed the database and wrote triggers and stored procedures.
- Worked with Struts MVC objects such as ActionServlet, controllers, validators, web application context, handler mappings, message resource bundles, and form controllers, and used JNDI to look up J2EE components.
- Implemented the Struts Validator framework to write customized JSP validations.
- Configured log4j to log the warning and error messages.
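A minimal sketch of a Struts 2 action using programmatic validation, one of the Struts validation mechanisms referenced above; the action, field names, and messages are hypothetical.

```java
import com.opensymphony.xwork2.ActionSupport;

/** Hypothetical Struts 2 action showing programmatic validation via validate(). */
public class LoginAction extends ActionSupport {

    private String username;
    private String password;

    @Override
    public void validate() {
        // Field errors are rendered next to the corresponding form fields by the JSP tags
        if (username == null || username.trim().isEmpty()) {
            addFieldError("username", "User name is required");
        }
        if (password == null || password.trim().isEmpty()) {
            addFieldError("password", "Password is required");
        }
    }

    @Override
    public String execute() {
        // Real authentication would be delegated to a service layer here
        return SUCCESS;
    }

    public String getUsername() { return username; }
    public void setUsername(String username) { this.username = username; }
    public String getPassword() { return password; }
    public void setPassword(String password) { this.password = password; }
}
```

When validate() records any field errors, Struts returns the INPUT result and redisplays the form instead of calling execute(), so validation stays out of the business logic.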
Environment: Java 1.5, Java 1.6, Struts 2, JDBC, JEE, XML, HTML, CSS, AJAX, jQuery 1.8, web services, SoapUI, JUnit, Quality Center, Eclipse 3.7, Toad, WebLogic 10.3
Confidential
Java developer
Responsibilities:
- Developed new features to the application by understanding the requirements from the business.
- Used the Spring framework integrated with the security service to enable security for the application and services.
- Implemented the Spring MVC architecture.
- Configured Bean properties using setter injection.
- Worked extensively with JSP and Servlets to accommodate all presentation customizations on the front end.
- Developed JSPs for the presentation layer.
- Created DML statements to insert/update data in the database and DDL statements to create/drop tables in the Oracle database.
- Configured Hibernate for storing, retrieving, and querying objects in the database and persisting relationships between objects (see the DAO sketch after this list).
- Configured the hibernate.cfg.xml file to connect to the database.
- Implemented the DAO design pattern to retrieve and store data from web services and to populate user account information for the admin to modify or create alternate/secondary IDs for the primary user ID account.
- Used JUnit for unit testing of the application.
- Deployed EAR files to the WebLogic application server using build tools.
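A minimal sketch of the Hibernate-backed DAO pattern described above; the entity, its fields, and the DAO methods are hypothetical, and the SessionFactory is assumed to be built from hibernate.cfg.xml elsewhere.

```java
import org.hibernate.Session;
import org.hibernate.SessionFactory;

/** Hypothetical entity; the mapping itself would live in an .hbm.xml file or annotations. */
class UserAccount {
    private Long id;
    private String primaryId;
    private String secondaryId;

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getPrimaryId() { return primaryId; }
    public void setPrimaryId(String primaryId) { this.primaryId = primaryId; }
    public String getSecondaryId() { return secondaryId; }
    public void setSecondaryId(String secondaryId) { this.secondaryId = secondaryId; }
}

/** Hypothetical DAO that hides Hibernate session handling from callers. */
public class UserAccountDao {

    private final SessionFactory sessionFactory;

    public UserAccountDao(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    /** Loads an account by primary key, or returns null if it does not exist. */
    public UserAccount findById(Long id) {
        Session session = sessionFactory.openSession();
        try {
            return (UserAccount) session.get(UserAccount.class, id);
        } finally {
            session.close();
        }
    }

    /** Saves a new or updated account inside its own transaction. */
    public void saveOrUpdate(UserAccount account) {
        Session session = sessionFactory.openSession();
        try {
            session.beginTransaction();
            session.saveOrUpdate(account);
            session.getTransaction().commit();
        } finally {
            session.close();
        }
    }
}
```

Keeping session and transaction handling inside the DAO lets the web-service-facing code work with plain entity objects, which matches the DAO pattern described in the bullet above.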
Environment: JDK 1.6, JSP 2.0, Spring, HTML, XML, WinCVS, Hibernate, and WebLogic.