Big Data/Hadoop Developer Resume
SUMMARY:
- Around 13.5 years of progressive experience in the IT industry with proven expertise in analysis, design, implementation, and testing of software applications using Big Data and Java-based technologies.
- 5 years of hands-on experience with Hadoop core and ecosystem components, including Spark, HDFS, MapReduce, Hive, and Sqoop.
- Developed ETL pipelines using shell scripting and Sqoop to extract data from MySQL and Teradata databases into HDFS and load it into Hive tables.
- Hands-on experience with Spark, converting HQL to Spark SQL and DataFrames to load data in different formats and perform transformations such as filtering, adding new columns, and aggregating.
- Experience working with the Hortonworks and Cloudera Hadoop distributions.
- Experience with Microsoft Azure data storage, including Azure Data Lake Store (ADLS).
- Expertise in the Hive data warehouse tool: creating tables, distributing data through partitioning and bucketing, and writing and optimizing HiveQL queries (a sketch of a partitioned, bucketed table follows this summary).
- Used various Python libraries with PySpark to create DataFrames and store them in Hive.
- Automated jobs for extracting data from sources such as MySQL, Teradata, and Tumbleweed into the Hadoop Distributed File System and loading Hive and SAP HANA tables.
- Developed ETL pipelines using DataStage to process source files, including data cleansing and transformation.
- Connected to BigQuery from DataStage using the BigQuery connector and loaded data into BigQuery tables.
- Hands-on experience setting up workflows with the CA7 scheduler, managing and scheduling Hadoop jobs by creating KPARMs.
- Good knowledge of scripting languages such as Linux/Unix shell scripting and Python.
- Experience working with build tools such as Maven and SBT.
- Analyzed SQL scripts and redesigned them using PySpark SQL for faster performance.
- Experience working with databases such as Oracle, MySQL, Teradata, and SAP HANA.
- Experienced and skilled Agile developer with a strong record of excellent teamwork and successful delivery.
- Strong problem-solving and analytical skills, with the ability to make balanced, independent decisions.
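Illustrative sketch of the Hive partitioning and bucketing described above, run through PySpark; the database, table, and column names (sales_db, orders, order_date, customer_id) are hypothetical placeholders, not from a specific project.

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-partition-bucket-sketch")
         .enableHiveSupport()          # needed so spark.sql talks to the Hive metastore
         .getOrCreate())

# Hypothetical table: partitioning by order_date prunes whole directories at query time;
# bucketing by customer_id spreads rows evenly within each partition.
spark.sql("""
    CREATE TABLE IF NOT EXISTS sales_db.orders (
        order_id    BIGINT,
        customer_id BIGINT,
        amount      DECIMAL(12, 2)
    )
    PARTITIONED BY (order_date STRING)
    CLUSTERED BY (customer_id) INTO 16 BUCKETS
    STORED AS ORC
""")

# A partition filter lets the engine read only the matching order_date directories.
daily_totals = spark.sql("""
    SELECT customer_id, SUM(amount) AS total_amount
    FROM sales_db.orders
    WHERE order_date = '2024-01-01'
    GROUP BY customer_id
""")
daily_totals.show()
```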
TECHNICAL SKILLS:
Big Data/Hadoop Ecosystem: Hadoop, HDFS, MapReduce, Hive, Spark, Sqoop, Azure, GCP, BigQuery, Airflow, DataStage
Programming Languages: Java, Python
Hadoop Distributions: Apache Hadoop, Cloudera Distribution (CDH3), and Hortonworks Data Platform (HDP)
Query Languages: HiveQL, SQL
Web Technologies: Java, J2EE, Struts, Spring, JSP, Servlet, JDBC, JavaScript
IDEs: Eclipse, RAD, WSAD
Frameworks: MVC, Struts, Spring, Hibernate
Build Tools: Ant, Maven
Databases: Oracle, MySQL, Teradata, SAP HANA
Operating systems: Windows, Linux, Unix
Scripting Languages: Shell scripting
Version Control system: SVN, GIT, Rational ClearCase
PROFESSIONAL EXPERIENCE:
Confidential
Big Data/Hadoop Developer
Responsibilities:
- Participate in and understand the business requirements, dimension, and fact measures.
- Design, develop, validate, and deploy ETL processes using Hadoop.
- Understand the mapping rules, data formats, and detailed logic for each of the target measures within the o9 platform.
- Build data pipelines for ingestion and aggregation events and load data into Hive external tables at HDFS locations.
- Develop PySpark jobs to read input Parquet files, process and transform the data, and load it into Hive tables (a sketch follows this section).
- Develop shell scripts to extract source data from Azure Data Lake and store it in HDFS.
- Develop Hive scripts to apply transformations and aggregations to the data and load it into the tenant schema.
Environment: Hadoop, HDFS, Azure Data Lake, Spark, Hive, JIRA (for Agile), GIT, ServiceNow, Shell script, Airflow.
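Illustrative sketch of the PySpark load referenced above; the input path, columns, and target Hive table name are hypothetical placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("parquet-to-hive-load")
         .enableHiveSupport()
         .getOrCreate())

# Hypothetical HDFS landing path for the input Parquet files.
events = spark.read.parquet("hdfs:///data/landing/demand_events/")

# Basic cleansing, a derived audit column, and an aggregation before the load.
transformed = (events
               .filter(F.col("quantity") > 0)               # drop invalid rows
               .withColumn("load_date", F.current_date())    # audit column
               .groupBy("item_id", "location_id", "load_date")
               .agg(F.sum("quantity").alias("total_quantity")))

# Append into a pre-created Hive external table (hypothetical name);
# insertInto matches columns by position against the existing table definition.
transformed.write.mode("append").insertInto("tenant_db.demand_aggregates")
```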
Confidential
Spark/Hadoop developer
Responsibilities:
- Participated in the development of technical requirements and design specifications as appropriate and developed the software as required.
- Designed, developed, validated, and deployed ETL processes using Hadoop.
- Built data pipelines for ingestion and aggregation events and loaded data into Hive external tables at HDFS locations to serve as feeds for several dashboards and web APIs.
- Developed Sqoop scripts to migrate data from MySQL to the big data environment (see the Sqoop sketch after this section).
- Developed ETL pipelines using DataStage to process source files, including data cleansing and transformation.
- Connected to BigQuery from DataStage using the BigQuery connector and loaded data into BigQuery tables.
- Migrated Hive workloads to Spark SQL (see the Spark SQL sketch after this section).
- Worked with the Spark DataFrame API to load files in different formats such as CSV and text.
- Worked with DataFrame transformations such as filtering, adding new columns, and aggregating to perform data analysis.
- Worked with different file formats such as CSV and text and with compression codecs such as Snappy, according to client requirements.
- Developed shell scripts to automate jobs before moving to production, configured by passing parameters.
- Performed historical and incremental loads into the target tables using shell scripting.
- Developed staging tables, stored procedures, and configuration files to load data into SAP HANA.
- Followed the CRQ process in ServiceNow to migrate code to the production environment and schedule jobs through CA7.
- Created job KPARMs on the mainframe for different kinds of production data loads.
- Scheduled automated jobs on a daily and weekly basis, as required, using CA7 as the scheduler.
- Worked on operational controls such as job failure notifications and email notifications for failure logs and exceptions.
- Supported the project team in successfully delivering the client's business requirements through all phases of the implementation.
Environment: Hadoop, HDFS, Spark, Sqoop, Hive, DataStage, GCP, BigQuery, MySQL, JIRA (for Agile), Teradata, GIT, ServiceNow, SAP HANA, CA7, Shell script.
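Sqoop sketch: an illustrative example of the MySQL-to-Hadoop ingestion referenced above, shown as a small Python wrapper of the kind used for job automation; the connection string, credentials path, and table names are hypothetical placeholders.

```python
import subprocess

# Placeholder connection and table details; real values would come from job parameters.
sqoop_cmd = [
    "sqoop", "import",
    "--connect", "jdbc:mysql://db-host:3306/retail",
    "--username", "etl_user",
    "--password-file", "hdfs:///user/etl/.mysql.password",
    "--table", "orders",
    "--target-dir", "/data/raw/retail/orders",
    "--hive-import",                      # load straight into a Hive table
    "--hive-table", "retail_db.orders",
    "--num-mappers", "4",
    "--compress",
    "--compression-codec", "org.apache.hadoop.io.compress.SnappyCodec",
]

# Fail the wrapper (and the scheduled job) if the import returns a non-zero exit code.
subprocess.run(sqoop_cmd, check=True)
```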
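Spark SQL sketch: an illustrative example of moving a HiveQL aggregation to Spark SQL and the equivalent DataFrame transformations (filter, new column, aggregate); file paths, columns, and view names are hypothetical placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("hive-to-sparksql-sketch")
         .enableHiveSupport()
         .getOrCreate())

# Load a delimited CSV/text feed into a DataFrame and expose it to SQL.
txns = (spark.read
             .option("header", "true")
             .option("inferSchema", "true")
             .csv("hdfs:///data/raw/transactions/"))
txns.createOrReplaceTempView("transactions")

# The original HiveQL, now executed by Spark SQL.
sql_result = spark.sql("""
    SELECT store_id, SUM(amount) AS total_amount
    FROM transactions
    WHERE status = 'COMPLETE'
    GROUP BY store_id
""")

# The same logic expressed with DataFrame transformations.
df_result = (txns.filter(F.col("status") == "COMPLETE")
                 .withColumn("amount_usd", F.col("amount") * F.lit(1.0))  # example derived column
                 .groupBy("store_id")
                 .agg(F.sum("amount_usd").alias("total_amount")))

# Write the aggregate back out, Snappy-compressed.
df_result.write.mode("overwrite").option("compression", "snappy").parquet(
    "hdfs:///data/curated/store_totals/")
```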
Confidential, Texas
Java Developer
Responsibilities:
- Gathered system requirements; performed design, analysis, coding, and unit testing of core system functionality; and corrected defects during various phases of testing.
- Coded JavaServer Pages for dynamic front-end content backed by servlets.
- Designed and developed Data Access Objects (DAO) to access the database.
- Used the DAO factory and value object design patterns to organize and integrate the Java objects.
- Used JDBC API to connect to the database and carry out database operations.
- Performed unit testing, system testing and integration testing
- Prepared metrics and presented project status for senior management review.
Environment: Java, Servlets, Web Services, JAX-WS, JAXB, DB2.
Confidential
Java Developer
Responsibilities:
- Developed the Low-level design document
- Created SOAP and REST API services to connect to different source systems and extract documents.
- Involved in extracting XML documents from the MarkLogic database.
- Performed unit testing, system testing and integration testing
- Interacted constantly with clients during the requirements analysis and development phases.
- Maintained the application by resolving bugs and implementing enhancements.
- Participated in Weekly Development and User interaction calls
Environment: Java, XML, RESTful Web Services, MarkLogic.
Confidential, New York
Java Developer
Responsibilities:
- Developed the dashboard using JSP and JavaScript.
- Developed the Low-level design document
- Created workflows and monitored progress using the Spring framework.
- Used JDBC API to connect to the database and carry out database operations.
- Developed data enrichment components to convert data from the database to XML.
- Performed unit testing, system testing and integration testing
Environment: Java, Spring, JavaScript.
Confidential
Java Developer
Responsibilities:
- Involved in the implementation of web tier using Servlets and JSP.
- Developed the user interface using JSP and JavaScript to view day-to-day activities.
- Coded HTML pages using CSS for static content and JavaScript for validations.
- Used JSP and JSTL tag libraries to develop user interface components.
- Performed unit testing, system testing and integration testing.
Environment: Java, JSP, Servlets, Struts, JavaScript, XML, Oracle.