
Sr. Big Data/Hadoop Engineer Resume


Boston, MA

SUMMARY

  • 9+ years of professional IT experience, spanning the Big Data ecosystem and Java/J2EE technologies.
  • Hands-on experience in the technical design and development of Big Data solutions using the Hadoop ecosystem.
  • Experience in working with Flume to load log data from multiple sources directly into HDFS.
  • Experience in importing and exporting structured and unstructured data using Sqoop between HDFS and relational databases.
  • Involved in HDFS maintenance and loading of structured and unstructured data.
  • Worked extensively with Amazon Web Services (AWS) cloud services such as EC2, S3, EBS, RDS and VPC.
  • Set up and managed cluster servers on AWS.
  • Experienced with migrating data to and from RDBMS and unstructured sources into HDFS using Sqoop.
  • Handled data in various file formats such as SequenceFile, Avro, RC, Parquet and ORC.
  • Experienced working with EC2 (Elastic Compute Cloud) cluster instances, setting up data buckets on S3 (Simple Storage Service) and setting up EMR (Elastic MapReduce).
  • Experienced in extending Hive and Pig core functionality by writing custom UDFs in Java (a minimal sketch appears after this list).
  • Years of experience in Big Data development across the Hadoop ecosystem: Hadoop, Spark, Hive, HDFS, HBase, Cassandra, Sqoop, Flume, Kafka, ZooKeeper and Oozie.
  • Worked on real-time, in-memory processing engines such as Spark and Impala, and on their integration with BI tools such as Tableau.
  • Experience using different file formats such as binary, XML, JSON and CSV; experience in developing solutions to analyze large data sets efficiently.
  • Extensive experience in developing MySQL, DB2 and Oracle database triggers, stored procedures and packages to quality standards using SQL and PL/SQL.
  • Intensive work experience in developing enterprise solutions using Java, J2EE, Agile, Servlets, JSP, JDBC, Struts, Spring, Hibernate, JavaBeans, JSF, MVC and JMS.
  • Comprehensive knowledge of debugging, optimizing and performance tuning of DB2, Oracle and MySQL databases.
  • Experience with Cloudera and HortonWorks distributions.
  • Expertise in Core Java, J2EE, multithreading, JDBC and shell scripting, and proficient in using Java APIs for application development.
  • Experienced in using IDEs and Tools like Eclipse, NetBeans, GitHub, Jenkins, Maven and IntelliJ.
  • Extensive experience in unit testing with JUnit, MRUnit and Pytest.
  • Experience with middleware architectures built on Sun Java technologies such as J2EE, JSP and Servlets, and with application servers such as WebSphere and WebLogic.
  • Experience in Object-Oriented Analysis and Design (OOAD) and software development using UML methodology; good knowledge of J2EE and Core Java design patterns.
  • Good interpersonal skills and the ability to work as part of a team; exceptional ability to learn and master new technologies and deliver results on short deadlines.
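As context for the custom Hive UDF work mentioned above, here is a minimal, hypothetical sketch of a Java UDF in the classic org.apache.hadoop.hive.ql.exec.UDF style; the class name and masking logic are invented for illustration, not taken from any actual project code:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF: mask all but the last four characters of a customer ID.
    public final class MaskCustomerId extends UDF {
        public Text evaluate(Text customerId) {
            if (customerId == null) {
                return null;
            }
            String id = customerId.toString();
            int keep = Math.min(4, id.length());
            StringBuilder masked = new StringBuilder();
            for (int i = 0; i < id.length() - keep; i++) {
                masked.append('*');
            }
            masked.append(id.substring(id.length() - keep));
            return new Text(masked.toString());
        }
    }

A UDF like this would be packaged in a JAR, added to the Hive session with ADD JAR, and registered with CREATE TEMPORARY FUNCTION mask_id AS 'MaskCustomerId'.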

TECHNICAL SKILLS

Big Data/Hadoop: HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, Hue, Flume, Kafka and Spark

NoSQL Databases: HBase, MongoDB and Cassandra

Java/J2EE Technologies: Servlets, JSP, JDBC, JSTL, EJB, JAXB, JAXP, JMS, JAX-RPC, JAX- WS

Web Tools: HTML, JavaScript, XML, ODBC, JDBC, Hibernate, JSP, Java, Struts, Spring, JUnit, JSON and Avro

Programming Languages: Java, Python, SQL, PL/SQL, AWS, HiveQL, Unix Shell Scripting, Scala

Methodologies: Software Development Life Cycle (SDLC), Waterfall and Agile models, STLC (Software Testing Life Cycle), UML, Design Patterns (Core Java and J2EE)

Databases: Oracle 12c/11g, MySQL, SQL Server 2016/2014

Web/Application Servers: WebLogic, Tomcat, JBoss

Web Technologies: HTML4/5, CSS3/2, XML, JavaScript, jQuery, AJAX, WSDL, SOAP

Tools and IDEs: Eclipse, NetBeans, Maven, DB Visualizer, Visual Studio 2008, SQL Server Management Studio

PROFESSIONAL EXPERIENCE

Confidential - Boston, MA

Sr. Big Data/Hadoop Engineer

Responsibilities:

  • Developed MapReduce programs in Python and Scala to process the extracted customer data.
  • Designed and developed several cloud-based Big Data reporting and analytics pipelines using Spark, Kafka, AWS (EC2, EMR, Elasticsearch, Data Pipeline, S3 and Redshift), Splunk, Kibana and MongoDB.
  • Worked extensively on ETL techniques across data warehousing, Big Data and traditional RDBMS platforms.
  • Revamped the existing database application into a data warehousing application using Ralph Kimball's data warehousing techniques; responsible for legacy data collection (data warehousing) and stress testing of this process.
  • Expertise in cloud, on-premise and hybrid deployments of Big Data solutions; directed AWS-based cloud strategies and solutions using EMR, Redshift, Kinesis and S3.
  • Experienced with Hadoop/Hive on AWS, using both EMR and self-managed Hadoop on EC2.
  • Worked on distributed frameworks such as Apache Spark and Presto on Amazon EMR and Redshift, and interacted with data in other AWS data stores such as Amazon S3 and Amazon DynamoDB.
  • Created shell scripts to automate the daily extraction and loading into Hive tables of the targeted customers to whom various email campaigns are sent.
  • Developed and deployed Hive UDFs written in Java for encrypting customer IDs, creating item image URLs, etc.
  • Extracted StrongView and Kafka logs from servers using Flume, extracted information such as customers' open/click activity, and loaded it into Hive tables.
  • Wrote shell scripts to automate Hive query processing.
  • Scheduled MapReduce and Hive workflows using Oozie and cron.
  • Used Apache NiFi to generate graphical representations of data transfer and flow.
  • Developed HTML templates for various trigger campaigns.
  • Worked on Oracle PL/SQL and Teradata for some data extraction and loading.
  • Analyzed transaction data and extracted category-wise best-selling item information, which the marketing team uses to develop ideas for new campaigns.
  • Developed complex Hive queries using joins and automated these jobs using shell scripts.
  • Developed Spark applications for both batch and streaming processing.
  • Created Hive external tables with ORC, Bucketing & Transactional properties.
  • Monitored and debugged Map-Reduce jobs using the Job-tracker administration page.
  • Developed Storm topologies for real-time email campaigns, using Kafka as the source of customers' website activity information and storing the data in a Redis server.
  • Involved in migrating existing Hive jobs to the Spark SQL environment.
  • Used the Spark Streaming API to consume data from Kafka, processed the data with core Spark functions written in Scala, and stored the results in an HBase table later used for generating reports (see the Java sketch after this list).
  • Developed a data pipeline using Flume to ingest data from a Kafka source into HDFS as the sink, where it is used for analysis.
  • Created a repository in GitHub (version control system) to store project and keep track of changes to files.
  • Used Eclipse Neon to develop Spark applications in the PyDev perspective.
  • Used IDLE (the Python GUI) to develop Python code and incorporated it into the Spark application.
  • Developed REST web services to provide the metadata required for the campaigns.
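The streaming work above was done in Scala; the following is a minimal Java sketch of the same consume-from-Kafka pattern using the spark-streaming-kafka-0-10 API. The broker address, topic, group ID and key layout are assumptions made for the example, and the HBase write is reduced to a per-batch print:

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka010.ConsumerStrategies;
    import org.apache.spark.streaming.kafka010.KafkaUtils;
    import org.apache.spark.streaming.kafka010.LocationStrategies;
    import scala.Tuple2;

    public class ClickStreamJob {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("ClickStreamJob");
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            Map<String, Object> kafkaParams = new HashMap<>();
            kafkaParams.put("bootstrap.servers", "broker1:9092"); // hypothetical broker
            kafkaParams.put("key.deserializer", StringDeserializer.class);
            kafkaParams.put("value.deserializer", StringDeserializer.class);
            kafkaParams.put("group.id", "clickstream-consumers"); // hypothetical group

            JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                    jssc,
                    LocationStrategies.PreferConsistent(),
                    ConsumerStrategies.<String, String>Subscribe(
                        Collections.singletonList("website-activity"), kafkaParams));

            // Count events per customer in each batch, assuming records are keyed
            // by customer ID; a real job would write these counts to HBase.
            stream.mapToPair(record -> new Tuple2<>(record.key(), 1L))
                  .reduceByKey(Long::sum)
                  .print();

            jssc.start();
            jssc.awaitTermination();
        }
    }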

Environment: MapReduce, Hive, HBase, Python, Java, Storm, Scala, Spark Streaming, Spark SQL, Redis, Oozie, Kafka, Flume, REST web services, Tableau, Teradata.

Confidential, Phoenix, AZ

Sr. Big Data/Hadoop Engineer

Responsibilities:

  • Developed multiple MapReduce programs to analyze the customer insurance data and produce summary results from Hadoop for downstream systems (a hedged Java sketch follows this list).
  • Automated workflows using shell scripts which pulls developed code from GitHub into Hadoop.
  • Developed, captured and documented architectural best practices for building systems on AWS.
  • Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using the CDH4 distribution.
  • Created Hive queries to compare the raw data with EDW reference tables and perform aggregations.
  • Developed MapReduce programs in Python.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Developed Pig Latin scripts to extract data from the web server output files and load it into HDFS.
  • Built wrapper shell scripts to drive these Oozie workflows.
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions.
  • Integrated bulk data into the Cassandra file system using MapReduce programs.
  • Installed and configured Hive and wrote Hive UDFs.
  • Performed SQL joins among Hive tables to produce input for the Spark batch process.
  • Active member in developing a streaming application using Apache Kafka.
  • Ingested all formats of structured and semi-structured data, including relational databases and JSON, into HDFS using NiFi and Kafka.
  • Used TOAD to query Hive tables for faster execution.
  • Involved in creating Hive tables, loading them with data and writing HiveQL queries, which run internally as MapReduce jobs.
  • Extracted data from MySQL and AWS Redshift into HDFS using Sqoop.
  • Deployed Hadoop Cluster in Fully Distributed and Pseudo-distributed modes.
  • Involved in managing and monitoring Hadoop cluster using Cloudera Manager.
  • Participated in daily SCRUM meetings to discuss development progress and was active in making the meetings more productive.
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig, Hive and Sqoop.
  • Unit tested a sample of raw data, improved performance and turned the process over to production.
  • Implemented AngularJS controller functions and services, using controller methods to set up the initial state of objects.
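As a hedged illustration of the summary-style MapReduce jobs above (the resume also notes Python implementations), here is a self-contained Java sketch that totals claim amounts per policy type; the two-column CSV input layout is an assumption made for the example:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.DoubleWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class ClaimsByPolicyType {
        // Assumes input lines of the form: policyType,claimAmount (hypothetical layout).
        public static class ClaimMapper extends Mapper<LongWritable, Text, Text, DoubleWritable> {
            @Override
            protected void map(LongWritable key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split(",");
                if (fields.length != 2) {
                    return; // skip malformed rows
                }
                try {
                    ctx.write(new Text(fields[0]), new DoubleWritable(Double.parseDouble(fields[1])));
                } catch (NumberFormatException ignored) {
                    // skip header rows or bad amounts
                }
            }
        }

        public static class SumReducer extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
            @Override
            protected void reduce(Text key, Iterable<DoubleWritable> values, Context ctx)
                    throws IOException, InterruptedException {
                double total = 0;
                for (DoubleWritable v : values) {
                    total += v.get();
                }
                ctx.write(key, new DoubleWritable(total));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "claims-by-policy-type");
            job.setJarByClass(ClaimsByPolicyType.class);
            job.setMapperClass(ClaimMapper.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(DoubleWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }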

Environment: Hadoop, MapReduce, ETL, Hive, HDFS, Pig, Scala, Agile, AngularJS, Python, Sqoop, Oozie, AWS, Cloudera, Flume, HBase, ZooKeeper, CDH3, Oracle, NoSQL and Unix/Linux.

Confidential - Houston, TX

Sr. Java /Hadoop Developer

Responsibilities:

  • Worked on capacity planning and management of HDFS Clusters to meet the project needs.
  • Worked on custom Pig loaders and storage classes to handle a variety of data formats such as JSON and XML.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs, which run independently based on time and data availability.
  • Developed a data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS.
  • Developed custom tags and JSTL to support custom user interfaces.
  • Involved in designing the user interfaces using JSPs and Servlets.
  • Developed the presentation layer using HTML, CSS and JavaScript.
  • Used the Ext JS framework to build interactive web applications using techniques such as Ajax, DHTML and DOM scripting.
  • Used XML web services over SOAP for a funds transfer application connecting remote, global financial institutions.
  • Involved in development of web services for business operations using various Web Services API and tools like SOAP, WSDL, JAX-WS, JDOM, XML and XSL.
  • Developed automated processes to flatten the upstream data from Cassandra, which is in JSON format, using Hive UDFs.
  • Involved in developing Pig UDFs for needed functionality not available out of the box in Apache Pig.
  • Involved in processing ingested raw data using MapReduce, Apache Pig and Hive.
  • Involved in writing Hive queries for analysis of the structured data in the HDFS output folder.
  • Developed the application using the Spring framework, which leverages the MVC (Model-View-Controller) architecture.
  • Developed the business domain layer using EJB session and entity beans.
  • Used the Java Message Service (JMS) for reliable, asynchronous exchange of important information such as payment status reports (a minimal JMS sketch follows this list).
  • Worked with a variety of issues involving multi-threading, server connectivity and user interface.
  • Made extensive use of the Java Naming and Directory Interface (JNDI) for looking up enterprise beans.
  • Developed SQL and PL/SQL stored procedures and database application scripts.
  • Involved in Sprint meetings and followed agile software development methodologies.
  • Deployed the application on WebLogic Application Server.
  • Developed JUnit test cases for all the developed modules.
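To illustrate the JMS and JNDI usage above, here is a minimal javax.jms sketch of an asynchronous payment-status publisher; the JNDI names and message body are placeholders, not the actual server configuration:

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import javax.naming.InitialContext;

    public class PaymentStatusPublisher {
        public static void main(String[] args) throws Exception {
            // Assumes a jndi.properties file pointing at the application server.
            InitialContext ctx = new InitialContext();

            // JNDI names are placeholders; the real names live in the server config.
            ConnectionFactory factory = (ConnectionFactory) ctx.lookup("jms/ConnectionFactory");
            Queue queue = (Queue) ctx.lookup("jms/PaymentStatusQueue");

            Connection connection = factory.createConnection();
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(queue);

            // Fire-and-forget: the consumer picks the report up asynchronously.
            TextMessage message = session.createTextMessage("paymentId=123;status=SETTLED");
            producer.send(message);

            connection.close();
        }
    }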

Environment: JDK 1.7, Java, Hadoop, MapReduce, HDFS, Hive, Sqoop, HBase, Pig, Oozie, PL/SQL, JSP, Spring, EJB, JMS, XML, UML, SOAP, Rational Rose, Eclipse, WebLogic, Hibernate, MS SQL Server

Confidential - Philadelphia, PA

Sr. Java/J2EE Developer

Responsibilities:

  • Implemented various J2EE design patterns such as Model-View-Controller (MVC), Data Access Object, Business Delegate and Transfer Object.
  • Involved in requirement gathering, functional and technical specifications.
  • Fixing the existing bugs in various releases.
  • Wrote requirements and detailed design documents, designed architecture for data collection.
  • Developed the OMSA UI using MVC architecture, Core Java, Java Collections, JSP, JDBC, Servlets and XML in Windows and UNIX environments.
  • Used Java collection classes such as ArrayList, Vector, HashMap and Hashtable.
  • Created Hibernate O/R mappings of the tables and integrated them with Spring transaction management (a hedged sketch follows this list).
  • Used Maven for building the application and completed testing by deploying on the application server.
  • Experienced in building RESTful (AJAX/JSON) applications.
  • Developed algorithms and coded programs in Java.
  • Designed the various tables required for the project in an Oracle 10g database and coded the SQL queries, stored procedures and triggers in the application.
  • Pushed the code to Jenkins and integrated the code with Maven.
  • Developed unit test cases and used JUnit for unit testing of the application.
  • Involved in design and implementation using Core Java, Agile, Struts, and JMS.
  • Developed complex SSAS cubes with multiple fact measure groups and multiple dimension hierarchies based on OLAP reporting needs.
  • Performed all types of testing, including unit testing and integration testing across environments.
  • Used XML for Data presentation, Report generation and customer feedback documents.
  • Developed JUnit test cases for regression testing and integrated with Maven build.
  • Worked on modifying an existing JMS messaging framework for increased load and performance optimization.
  • Used a combination of client- and server-side validation via the Struts validation framework.
  • Used joins, triggers, stored procedures and functions to interact with the backend database using SQL.
  • Coordinated with other development teams, system managers and the webmaster, and fostered a good working environment.
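As a hedged sketch of the Hibernate mapping and Spring transaction integration above, here is a hypothetical annotated entity and a transactional service method; the entity name, table, fields and service are invented for illustration:

    // Order.java -- hypothetical mapped entity.
    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.Id;
    import javax.persistence.Table;

    @Entity
    @Table(name = "ORDERS")
    public class Order {
        @Id
        @GeneratedValue
        private Long id;

        private String customerName;
        private double total;

        // getters and setters omitted for brevity
    }

    // OrderService.java -- runs inside a Spring-managed transaction.
    import org.hibernate.SessionFactory;
    import org.springframework.transaction.annotation.Transactional;

    public class OrderService {
        private final SessionFactory sessionFactory;

        public OrderService(SessionFactory sessionFactory) {
            this.sessionFactory = sessionFactory;
        }

        @Transactional
        public void placeOrder(Order order) {
            // Spring opens, commits or rolls back the transaction around this call.
            sessionFactory.getCurrentSession().save(order);
        }
    }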

Environment: Java, Agile, JSP, JSON, HTML, Eclipse IDE, Ajax, Design Patterns, Struts, Spring, Hibernate, Oracle, SQL/PL-SQL, JMS.

Confidential 

Java/J2EE Developer

Responsibilities:

  • Involved in developing business domain concepts into use cases, sequence diagrams, class diagrams, component diagrams and implementation diagrams.
  • Designed and developed web services using Java/J2EE in a WebLogic environment.
  • Developed web pages using Java Servlets, JSP, CSS, JavaScript, DHTML and HTML; added extensive Struts validation; wrote Ant scripts to build and deploy the application.
  • Involved in the analysis, design, development and unit testing of business requirements.
  • Responsible for developing platform-related logic and resource classes, and controller classes to access the domain and service layers.
  • Developed the UI using HTML and CSS, and developed business logic and interfacing components using business objects, XML and JDBC.
  • Implemented the business logic and generated WSDL for the web services using SOAP.
  • Developed the application using the Spring Web MVC framework.
  • Worked with Spring Configuration files to add new content to the website.
  • Worked on the Spring DAO module and ORM using Hibernate; used HibernateTemplate and HibernateDaoSupport for Spring-Hibernate communication (a small DAO sketch follows this list).
  • Configured association mappings such as one-to-one and one-to-many in Hibernate.
  • Developed Servlets and JSP based on MVC pattern using Struts Action framework.
  • Involved in writing Hibernate queries and Hibernate specific configuration and mapping files.
  • Frequently wrote SQL in PL/SQL Developer to update and retrieve data from the Oracle database.
  • Developed unit test cases with JUnit.
  • Used the Log4j logging framework to write log messages at various levels.
  • Involved in fixing bugs and minor enhancements for the front-end modules.
  • Coded various classes for business logic implementation; developed and deployed UI-layer logic using JSP.
  • Worked with Struts MVC objects such as Action Servlets, Controllers, Validators, Web Application Context, Handler Mappings and Message Resource Bundles, and used JNDI lookups for J2EE components.
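To illustrate the HibernateDaoSupport pattern mentioned above, here is a minimal, hypothetical DAO; Customer is assumed to be a Hibernate-mapped entity defined elsewhere, and the session factory is wired in through Spring configuration:

    import org.springframework.orm.hibernate3.support.HibernateDaoSupport;

    // Hypothetical DAO: HibernateDaoSupport supplies a HibernateTemplate
    // backed by the SessionFactory injected via Spring configuration.
    public class CustomerDao extends HibernateDaoSupport {

        public Customer findById(Long id) {
            return (Customer) getHibernateTemplate().get(Customer.class, id);
        }

        public void save(Customer customer) {
            getHibernateTemplate().saveOrUpdate(customer);
        }
    }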

Environment: J2EE, JDBC, Java, Servlets, JSP, Struts, Hibernate, Web services, SOAP, WSDL, Design Patterns, MVC, HTML, JavaScript, WebLogic, XML, JUnit, Oracle 9i, MyEclipse.
