Hadoop Developer Resume
Boston, MA
SUMMARY:
- Versatile Software Developer with over 8 years of experience, including 4 years focused on Hadoop and 4+ years in Java/J2EE enterprise application design, development, and maintenance.
- Strong experience delivering Big Data projects across multiple domains and in all phases of the SDLC: requirements gathering, system design, development, enhancement, maintenance, testing, deployment, and production support.
- Strong experience configuring and using Big Data and Hadoop ecosystem tools such as MapReduce, HDFS, Pig, Hive, Sqoop, Oozie, Flume, HBase, Kafka, and Spark.
- Experience in developing customized UDFs in Java to extend Hive and Pig Latin functionality.
- Good understanding of HDFS design, daemons, federation, and HDFS high availability (HA).
- Good understanding of Spark Core, Spark SQL, and Spark Streaming.
- Good knowledge of UNIX and shell scripting.
- Knowledge of NoSQL databases such as HBase and DynamoDB.
- Technical expertise in EJB, JBoss, RESTful web services, Maven, JUnit, and the Arquillian integration testing framework.
- Technical expertise with the GCC compiler, GDB, Wireshark, and TCP/UDP sockets.
- Experience with IDEs such as Eclipse, IntelliJ, and Visual Studio.
- Experience with version control tools such as Rational ClearCase, Git, Visual SourceSafe, and SVN.
- Experience working with relational databases such as Postgres, and in SQL programming.
- Experienced in Agile methodology as a subject matter expert and technical coordinator for work effort estimation, allocation, and technical documentation.
- Possess superior design and debugging capabilities, innovative problem solving, and excellent analytical skills.
- Involved in process improvement activities that reduce operational costs.
- Created multiple tools that automate repetitive manual tasks, reducing effort by up to 50%.
- Experienced in quality assurance activities, including peer reviews, causal analysis of defects, and creation of checklists to improve deliverable quality.
- Focused on quality and process; excellent written and verbal communication skills and a strong team player.
- Quick to adapt to new software applications and products; a self-starter with excellent communication skills and a good understanding of business workflows.
TECHNICAL SKILLS:
Big Data Technologies: Apache Spark, HDFS, YARN, Hive, MapReduce, Pig, Sqoop, Flume, Oozie, Kafka
Programming Languages: Scala, Core Java, EJB, C/C++
RDBMS: Postgres, MySQL
NoSQL Databases: HBase, DynamoDB
Operating Systems: Linux (CentOS and SUSE), HP-Itanium and Windows
Special tools: Maven, Autosys, GDB, Wireshark, Make.
Version Control: SVN, Git, ClearCase, Visual SourceSafe
PROFESSIONAL EXPERIENCE:
Confidential, Boston, MA
Hadoop Developer
Responsibilities:
- Developed a Sqoop job to pull PRDS (party reference data) from Teradata into HDFS.
- Prepared XML definitions for each source system (ATM, Loans, Teller, etc.) to validate each record in the HDFS source files; the XML definitions themselves are validated against an XSD.
- Loaded delimited, position-based, and binary files into Spark via SparkContext and validated them against the XML definitions.
- Implemented repartitioning, caching, and broadcast variables on RDDs and DataFrames to achieve better performance on the cluster.
- Created separate Parquet files for valid and invalid records for all source systems.
- Stored the Parquet data in Hive with daily date partitions for further querying.
- Combined the validated Parquet files of two or more systems in a curation module to derive the common transaction data.
- Created DataFrames by reading the validated Parquet files and ran SQL queries using SQLContext to obtain the common transaction data from all systems (a minimal sketch of this flow follows this list).
- Developed Spark jobs using Scala in the test environment for faster data processing and used Spark SQL for querying.
- Worked with Spark on top of YARN/MRv2 for interactive and batch analysis.
- Executed Oozie workflows to run multiple Hive and Pig jobs.
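A minimal sketch of the validation-and-store flow above, using the Spark 1.6-era APIs listed in the environment below. The file paths, column names, and the stand-in length check are hypothetical placeholders; the real record rules come from the per-system XML/XSD definitions.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.functions.lit

object PrdsValidation {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("prds-validation"))
    val sqlContext = new HiveContext(sc)
    import sqlContext.implicits._

    // Hypothetical delimited source file pulled into HDFS by the Sqoop job
    val raw = sc.textFile("hdfs:///data/prds/atm/2017-01-01/part-*")
      .map(_.split("\\|", -1))

    // Stand-in validation rule; invalid records are written out separately
    val valid   = raw.filter(_.length == 3).cache()
    val invalid = raw.filter(_.length != 3)
    invalid.map(_.mkString("|")).saveAsTextFile("hdfs:///curated/prds/atm/invalid")

    // Valid records go to Parquet with a daily date partition for Hive queries
    val validDF = valid.map(r => (r(0), r(1), r(2))).toDF("party_id", "txn_id", "amount")
      .withColumn("load_date", lit("2017-01-01"))
    validDF.write.mode("append").partitionBy("load_date")
      .parquet("hdfs:///curated/prds/atm/valid")

    // Read the curated Parquet back as a DataFrame and query it through SQLContext
    sqlContext.read.parquet("hdfs:///curated/prds/atm/valid").registerTempTable("atm_valid")
    sqlContext.sql("SELECT party_id, COUNT(*) AS txns FROM atm_valid GROUP BY party_id").show()
  }
}
```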
Environment: JDK 1.8, Apache Spark 1.6, Scala 2.10, Sqoop, Oozie, Hive, AutoSys, YARN cluster, Cloudera Distribution, IntelliJ IDE, Maven.
Confidential, Warren, NJ
Hadoop/Spark Developer
Responsibilities:
- Worked on the design and deployment of a Hadoop cluster and various Big Data analytic tools, including Pig, Hive, HBase, Oozie, ZooKeeper, Sqoop, and Spark.
- Created end-to-end Spark applications to perform data cleansing, validation, transformation, and summarization on user behavioral data.
- Developed Scala scripts and UDFs using both DataFrames/Spark SQL and RDDs/MapReduce in Spark for data aggregation and queries, and wrote data back into the RDBMS through Sqoop.
- Improved the performance of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, RDDs, and Spark on YARN.
- Developed data pipelines to load data from sources such as IBM mainframes and SQL Server using Sqoop, along with Kafka and Spark Streaming and processing frameworks, as per requirements.
- Imported data from a Kafka consumer group into Apache Spark through the Spark Streaming APIs.
- Performed real-time streaming of data using Spark with Kafka (a minimal sketch follows this entry).
- Performed advanced analytics and feature selection/extraction using the Apache Spark machine learning and streaming libraries in Scala.
- Worked extensively on importing metadata into Hive using Scala and migrated existing tables and applications to Hive.
- Transferred data from legacy systems to HDFS and HBase using Sqoop.
- Loaded data into HBase using both bulk and non-bulk loads.
- Presented the analyzed data in the form of reports using Tableau.
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
- Developed data pipelines using Sqoop and Pig to extract data from weblogs and store it in HDFS.
- Worked extensively with Hive to transform files from different analytical formats to plain text (.txt), enabling the data to be viewed for further analysis.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Implemented schedulers on the JobTracker to make effective use of the cluster resources available to MapReduce jobs.
- Gained extensive knowledge of and exposure to PySpark and the various Spark APIs.
- Developed POCs using Scala, Spark SQL, and MLlib along with Kafka and other tools as required, then deployed them on the YARN cluster.
- Developed a POC to configure and install Apache Hadoop on AWS EC2; additionally, a Cassandra cluster was deployed in the AWS environment with a high level of scalability as per requirements.
- Involved in story-driven Agile development methodology and actively participated in daily scrum meetings.
Environment: Hadoop, YARN, AWS, Java SE 7, Scala, Python, Spark, Spark SQL, Spark MLlib, MapReduce, HDFS, HBase, Hive, Pig, Kafka, Storm, Flume, Cassandra, Oozie, ZooKeeper, Cloudera CDH4/5 Distribution, SQL Server.
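A minimal sketch of the Kafka-to-Spark-Streaming ingestion described above, using the direct-stream API from the spark-streaming-kafka (0.8) integration. The broker address, topic name, and staging path are hypothetical placeholders.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object KafkaIngest {
  def main(args: Array[String]): Unit = {
    // 10-second micro-batches for near-real-time processing
    val ssc = new StreamingContext(new SparkConf().setAppName("kafka-ingest"), Seconds(10))

    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("user-events"))

    // Basic cleansing of each micro-batch before staging it in HDFS for downstream jobs
    stream.map(_._2)
      .filter(_.trim.nonEmpty)
      .foreachRDD { rdd =>
        rdd.saveAsTextFile(s"hdfs:///staging/user-events/${System.currentTimeMillis}")
      }

    ssc.start()
    ssc.awaitTermination()
  }
}
```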
Confidential, Newark, NJ
Hadoop Developer
Responsibilities:
- Identified the key areas of the solution and parallelized the data loads and processing.
- Developed data pipelines using Flume, Sqoop, Pig, and Java MapReduce to ingest data into HDFS for analysis.
- Worked in a team planning, designing, and building the end-to-end solution that enables user-driven analytics on top of the data residing in Hive.
- Responsible for managing and scheduling jobs on the Hadoop cluster.
- Hands-on experience in database performance tuning and data modeling.
- Actively involved in code reviews and bug fixing to improve performance.
- Used Impala to process data from Hive tables.
- Developed MapReduce jobs to transform data and store it in HBase and Impala.
- Involved in loading data from the UNIX file system into HDFS.
- Developed Pig Latin scripts to extract data from log files and store it in HDFS.
- Created User Defined Functions (UDFs) to pre-process data for analysis.
- Developed Pig UDFs to manipulate data according to business requirements and worked on developing custom Pig loaders.
- Ran various Hive queries on the data dumps and generated aggregated datasets for downstream systems for further analysis (as sketched after this list).
- Analyzed partitioned and bucketed data in Hive and computed various metrics to evaluate performance on the Hadoop cluster.
- Used Sqoop to import and export data between RDBMS/Teradata and HDFS.
- Managed Hadoop log files and handled data manipulation using Python scripts.
- Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
- Implemented automatic failover using ZooKeeper and the ZooKeeper failover controller.
- Designed ETL flow for several Hadoop Applications.
- Created Talend ETL jobs for data transformation, data sourcing, and mapping.
- Developed Oozie workflows and used Oozie operational services for batch processing and scheduling the workflows dynamically.
- Implemented CRUD operations on HBase data using the Thrift API to get real-time insights.
- Used Flume to collect, aggregate, and store log data from different web servers.
- Installed and configured distributed messaging systems such as Kafka.
- Very good understanding of tuning the number of mappers and reducers for MapReduce jobs on the cluster.
- Implemented a Product Recommendation Service using Mahout.
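A minimal sketch of the kind of Hive aggregation run over partitioned data, issued here through Spark's HiveContext since Spark appears in the environment below; the same HiveQL runs unchanged from the Hive CLI. The table name, columns, partition value, and output path are hypothetical placeholders.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object WeblogAggregation {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("weblog-aggregation"))
    val hiveContext = new HiveContext(sc)

    // Restricting the query to one date partition limits the scan to that partition's files
    val daily = hiveContext.sql(
      """SELECT page, COUNT(*) AS hits
        |FROM weblogs
        |WHERE load_date = '2016-06-01'
        |GROUP BY page""".stripMargin)

    // Aggregated dataset handed off to downstream systems
    daily.write.mode("overwrite").parquet("hdfs:///aggregates/weblogs/2016-06-01")
  }
}
```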
Environment: Hadoop, MapReduce, Hive, HDFS, Pig, Sqoop, Flume, HBase, Spark, ZooKeeper, AWS, SQL Server, Teradata, Talend, AutoSys, MySQL, Impala, Python, UNIX, TortoiseGit.
Confidential, Chattanooga, TN
Java/J2EE Developer
Responsibilities:
- Played an effective role in the team by interacting with welfare business analysts/program specialists and transforming business requirements into system requirements.
- Involved in developing the application on the Java/J2EE platform. Implemented the Model-View-Controller (MVC) structure using Struts.
- Enhanced the portal UI using HTML, JavaScript, XML, JSP, Java, and CSS as per requirements, providing client-side JavaScript validations and server-side Bean Validation (JSR 303).
- Developed web services components using XML, WSDL, and SOAP with a DOM parser to transfer and transform data between applications.
- Developed analysis-level documentation such as use cases, business domain models, and activity, sequence, and class diagrams.
- Handled design and technical reviews with other project stakeholders.
- Implemented services using Core Java.
- Developed and deployed UI layer logics of sites using JSP.
- Used Spring MVC for the implementation of business model logic.
- Used SoapUI to test the web services by sending SOAP requests.
- Used an AJAX framework for server communication and a seamless user experience.
- Created a test framework on Selenium and executed web testing in Chrome, IE, and Mozilla Firefox through WebDriver.
- Worked with Struts MVC objects such as action servlets, controllers, validators, the web application context, handler mappings, and message resource bundles, and used JNDI look-ups for J2EE components.
- Developed dynamic JSP pages with Struts.
- Employed built-in and custom Struts interceptors and validators.
- Developed XML data objects to generate PDF documents and reports.
- Employed Hibernate, DAOs, and JDBC for data retrieval and modifications in the database.
- Web service messaging and interaction were handled using SOAP.
- Developed JUnit test cases for unit testing as well as system and user test scenarios.
Environment: Struts, Hibernate, Spring MVC, SOAP, WSDL, WebLogic, Java, JDBC, JavaScript, Servlets, JSP, JUnit, XML, UML, Eclipse, Windows.
Confidential
Jr. Java Developer
Responsibilities:
- Involved in designing the project structure and system design, and in every phase of the project.
- Responsible for developing platform-related logic, resource classes, and controller classes to access the domain and service classes.
- Developed the UI using HTML, JavaScript, and JSP, and developed business logic and interfacing components using Business Objects, XML, and JDBC.
- Designed the user interface and implemented validation checks using JavaScript.
- Managed connectivity using JDBC for querying/inserting and data management, including triggers and stored procedures.
- Involved in Technical Discussions, Design, and Workflow.
- Participated in requirements gathering and analysis.
- Developed unit test cases using the JUnit framework.
- Implemented the data access using Hibernate and wrote the domain classes to generate the Database Tables.
- Involved in the design of JSPs and servlets for navigation among the modules.
- Designed cascading style sheets and the XML parts of the Order Entry and Product Search modules, and performed client-side validations with JavaScript.
- Implemented view pages based on XML attributes using plain Java classes.
- Involved in integration of APP Builder and UI modules with the platform.
Environment: Hibernate, Java, JAXB, JUnit, XML, UML, Oracle 11g, Eclipse, Windows XP.