Java/J2EE Application Developer Resume
SUMMARY
- Around 8+ years of IT experience, including 4+ years as a Big Data Developer and 4 years across multiple other technologies.
- Expert in delivering analytics projects using Big Data technologies.
- Experience with multiple Hadoop distributions, including Cloudera (CDH3 & CDH4), Hortonworks Data Platform (HDP) and Amazon Elastic MapReduce (EMR).
- Highly capable of processing large structured, semi-structured and unstructured datasets and supporting Big Data applications.
- Hands-on experience creating data pipelines with the ETL tools NiFi and Kylo.
- Created multiple NiFi templates to collect data from sources such as Net Base, Twitter, Stats, DDS, PIR and Matrix.
- Hands-on experience with Hadoop ecosystem components such as MapReduce (processing), HDFS (storage), YARN, Sqoop, Pig, Hive, HBase, Oozie and ZooKeeper.
- Experience with NoSQL databases like HBase, MapR and Cassandra, as well as other ecosystem components like ZooKeeper, Oozie, Impala, Storm, Spark Streaming/SQL, Kafka, Hypertable and Flume.
- Expertise in transferring data between the Hadoop ecosystem and structured data stores in an RDBMS such as MySQL, Oracle, Teradata and DB2 using Sqoop.
- Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive (illustrative sketch at the end of this summary).
- Expertise in moving large amounts of log, streaming event and transactional data using Flume.
- Hands-on experience developing workflows that execute MapReduce, Sqoop, Pig, Hive and shell scripts using Oozie.
- Experience working with the Cloudera Hue interface and Impala.
- Installed and configured Talend ETL in single- and multi-server environments.
- Created standards and best practices for Talend ETL components and jobs.
- Expertise in working with transactional databases like Oracle, SQL Server, MySQL and DB2.
- Expertise in developing SQL queries and stored procedures.
- Worked as a developer with shell and Python scripting on Linux/UNIX platforms.
- Fluent in core Java concepts such as I/O, multithreading, exceptions, regular expressions, collections, data structures and serialization.
- Good working knowledge of Cassandra and MongoDB 3.2.
- Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
- Experienced with the Eclipse and NetBeans IDEs.
- Experienced in web design technologies such as XML, HTML, DHTML and JavaScript.
- Ability to adapt to evolving technology, strong sense of responsibility and accomplishment.
- Excellent leadership, interpersonal, problem solving and time management skills.
- Excellent communication skills both written (documentation) and verbal (presentation).
- Responsible team player who can also work independently with minimal supervision.
- Strong development experience with Agile methodology.
- Strong experience across all phases of the Software Development Life Cycle (SDLC), including planning, design, development and testing.
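Illustrative sketch for the Spark-on-Hive analytics item above: a minimal Spark SQL job shown via the Java API (the project work itself was done in Scala); the table and column names ("web_logs", "event_date") are hypothetical placeholders.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class HiveAnalyticsJob {
        public static void main(String[] args) {
            // Hive support lets spark.sql() resolve tables registered in the Hive metastore.
            SparkSession spark = SparkSession.builder()
                    .appName("hive-analytics")
                    .enableHiveSupport()
                    .getOrCreate();

            // Simple daily aggregation over a (hypothetical) Hive table.
            Dataset<Row> daily = spark.sql(
                    "SELECT event_date, COUNT(*) AS events FROM web_logs GROUP BY event_date");
            daily.show(20);

            spark.stop();
        }
    }

A job of this shape would typically be submitted with spark-submit --master yarn so it runs under YARN against the cluster's Hive metastore.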
TECHNICAL SKILLS
Programming Languages: C, C++, Java (core), J2EE, ASP.NET, Python, Scala, UNIX Shell Scripting.
Web Languages: HTML, PHP, JavaScript, CSS.
Hadoop Ecosystem: MapReduce, HBase, Hive, Pig, Sqoop, ZooKeeper, Oozie, Flume, Hue, Kafka, AWS EMR.
Spark: Spark Core, GraphX, Spark Streaming, Spark SQL, Spark MLlib.
Hadoop Distributions: Cloudera, Hortonworks, MapR.
Database Languages: MySQL, NoSQL, MongoDB.
Database: Oracle DB, Cassandra, DynamoDB.
Virtualization & Cloud Tools: Amazon AWS, VMware, Virtual Box.
Web/Application Servers: Apache Tomcat.
Version Control Tools: SVN, Git.
Operating Systems: MS-DOS, Windows, Linux (Ubuntu, Red Hat, CentOS).
Design Tools: Rational Rose 2003.
IDE Platforms: Eclipse, PyCharm, NetBeans, Visual Studio.
Methodologies: Agile, SDLC.
Tools: PuTTY, Talend Open Studio, NiFi and Kylo.
Build Tools: Maven.
PROFESSIONAL EXPERIENCE
Confidential
Big Data Pipeline Developer
Responsibilities:
- Hands-on experience creating data pipelines with the ETL tools NiFi and Kylo.
- Involved in architecting data pipelines for the APT front-end tool at Coca-Cola.
- Involved in creating the high-level APT logical data flow.
- Created multiple NiFi templates to collect data from sources such as Net Base, Twitter, Stats, DDS, PIR and Matrix.
- Experienced in loading data from S3 into Hive using NiFi.
- Created SQS events and a queue to configure a Kylo feed that reads data from CSV files in S3.
- Created the “APT Ingest to Data Lake” template, which listens to an S3 SQS event queue for notifications of file creation.
- Worked on a pipeline template to populate a Hive table in the data lake from an S3 file.
- Managed data sources generated internally by Coke employees as well as manual sources such as E-Poll, Scarborough, Competitive Sponsors, Simmons, Ticket Manager, College Attendance, College Enrollment, College Fanbase, Investment, Marketing Rights and Fans by States.
- Worked on a NiFi processor to identify the S3 bucket and filename from the SQS JSON object.
- Developed transformations that, where applicable, are performed in a temp table prior to loading into the final table.
- Optimized the Java code of the DynamoDBWritter and DynamoDBReader processors, which read Avro data from the flow file and parse it against the Avro schema (illustrative sketch at the end of this list).
- Wrote HQL queries to create external tables using SerDes for different storage formats, i.e. ORC, Avro and JSON SerDes.
- Imported and exported data into HDFS and Hive using Sqoop.
- Created Hive queries using complex data types and date data types.
- Data loading involved creating Hive tables and partitions based on requirements.
- Developed shell/Python scripts to generate behavioral data reports.
- Wrote shell scripts to log errors and data performance while collecting data from external sources with NiFi and Kylo, and wrote cron jobs to schedule the Kylo and NiFi jobs.
- Worked with the Think Big team to improve NiFi and Kylo performance.
- Developed manual data pipelines used by the APT front-end tool.
- Created a back feed to read data from DynamoDB into Hive tables on a cron-based schedule.
- Created views in Hive; data from the views is pushed into DynamoDB on a weekly basis.
- Expertise in Hive Query Language and debugging Hive issues.
- Developed the Matrix pipeline and managed all the Coke-related pipelines.
- Trained the offshore team to maintain the data pipelines, with clear documentation for each.
- Worked with different file formats (ORCFILE, TEXTFILE) and compression codecs (GZIP, SNAPPY, LZO).
- Worked with various Hadoop file formats, including Text, SequenceFile and RCFile.
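Illustrative sketch for the DynamoDB processor item above: a minimal example of reading Avro records from a flow file's content stream with the Avro Java API, shown outside the actual NiFi processor wrapper; the field name "brand_id" is a hypothetical placeholder.

    import java.io.IOException;
    import java.io.InputStream;
    import org.apache.avro.file.DataFileStream;
    import org.apache.avro.generic.GenericDatumReader;
    import org.apache.avro.generic.GenericRecord;

    public class AvroFlowFileParser {
        // Reads Avro records from a flow file's content stream; the writer schema embedded
        // in the Avro container drives the parsing.
        public static void parse(InputStream flowFileContent) throws IOException {
            try (DataFileStream<GenericRecord> records =
                         new DataFileStream<>(flowFileContent, new GenericDatumReader<GenericRecord>())) {
                while (records.hasNext()) {
                    GenericRecord record = records.next();
                    // "brand_id" is a placeholder field; a real processor would map each
                    // field to a DynamoDB attribute here.
                    Object brandId = record.get("brand_id");
                    System.out.println(brandId);
                }
            }
        }
    }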
Environment: HDFS, MapReduce, NiFi, Kylo, AWS S3, SQS, DynamoDB, Spark, Pig, Hive, Flume, Sqoop, Kafka, Java, UNIX Shell Scripting, RStudio and MySQL.
Confidential
Hadoop/Spark Developer
Responsibilities:
- Involved in various stages of the project, including planning, hardware and software estimation, and installation of Hadoop ecosystem components. Involved in Hadoop cluster administration, including adding and removing cluster nodes, cluster capacity planning and performance tuning.
- Built real-time processing of telecom subscriber behavior data using Kafka and Storm spouts, bolts and Trident topologies.
- Developed HBase and HDFS bolts in Storm topologies.
- Worked on Hadoop cluster capacity planning and management.
- Monitored and debugged Hadoop jobs and Storm applications running in production.
- Implemented Kafka producer jobs to load data from various sources into Kafka (illustrative sketch at the end of this list).
- Wrote Pig scripts to read data from HDFS and write into Hive tables.
- Experience performance-tuning Hive scripts, Pig scripts and MR jobs in a production environment by altering job parameters.
- Experienced in managing and reviewing Hadoop log files.
- Provided various hourly/weekly/monthly aggregation reports required by clients through Spark.
- Broke down large system requirements into manageable parts.
- Used MapReduce to index large amounts of data for easy access to specific records.
- Automated data pulls from SQL Server to the Hadoop ecosystem via Sqoop.
- Worked on data processing, mainly converting unstructured data to semi-structured data and loading it into Hive and HBase tables for integration.
- Loaded log data into HDFS using Flume. Worked extensively on creating MapReduce jobs to power data for search and aggregation.
- Wrote Apache Pig scripts to process HDFS data.
- Wrote MapReduce code for filtering data.
- Created Hive tables to store the processed results in a tabular format.
- Developed Sqoop scripts to enable interaction between Pig and Oracle.
- Wrote script files for processing data and loading it into HDFS.
- Worked extensively with Sqoop for importing data from Oracle.
- Utilized Apache Hadoop ecosystem tools like HDFS, Hive and Pig for large dataset analysis.
- Developed Pig and Hive UDFs to analyze complex data and find specific user behavior.
- Experienced in using Pig for data cleansing and developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Excellent experience working on tHDFSInput, tHDFSOutput, tPigLoad, tPigFilterRow, tPigFilterColumn, tPigStoreResult, tHiveLoad, tHiveInput, tHbaseInput, tHbaseOutput, tSqoopImport, and tSqoopExport.
- Worked on Hive by creating external and internal tables, loading them with data and writing Hive queries.
- Created HBase tables to store data from different sources.
- Developed workflows in Oozie to automate the tasks of loading data into HDFS and pre-processing it with Pig and Hive.
- Worked with different file formats (ORCFILE, TEXTFILE) and compression codecs (GZIP, SNAPPY, LZO).
- Worked with various Hadoop file formats, including Text, SequenceFile and RCFile.
- Configured ZooKeeper for cluster coordination services.
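Illustrative sketch for the Kafka producer item above: a minimal Java producer; the broker address, topic name, key and payload are hypothetical placeholders (a real job would build them from the source feeds).

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class SubscriberEventProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092"); // placeholder broker address
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");

            // try-with-resources closes the producer after buffered records are flushed.
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("subscriber-behavior", "sub-123",
                        "{\"event\":\"voice_call\",\"duration\":42}"));
                producer.flush();
            }
        }
    }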
Environment: HDFS, MapReduce, Spark, Pig, Hive, HBase, Flume, Sqoop, Kafka, Java, UNIX Shell Scripting and MySQL.
Confidential
Hadoop Developer
Responsibilities:
- Responsible for understanding the scope of the project and requirements gathering.
- Used MapReduce to index large amounts of data for easy access to specific records.
- Loaded log data into HDFS using Flume. Worked extensively on creating MapReduce jobs to power data for search and aggregation.
- Wrote Apache Pig scripts to process HDFS data.
- Wrote MapReduce code for filtering data (illustrative sketch at the end of this list).
- Created Hive tables to store the processed results in a tabular format.
- Developed Sqoop scripts to enable interaction between Pig and Oracle.
- Wrote script files for processing data and loading it into HDFS.
- Worked extensively with Sqoop for importing data from Oracle.
- Utilized Apache Hadoop ecosystem tools like HDFS, Hive and Pig for large dataset analysis.
- Developed Pig and Hive UDFs to analyze complex data and find specific user behavior.
- Experienced in using Pig for data cleansing and developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Developed MapReduce ETL in Java/Pig and performed data validation using Hive.
- Worked on Hive by creating external and internal tables, loading them with data and writing Hive queries.
- Created HBase tables to store data from various sources.
- Developed workflows in Oozie to automate the tasks of loading data into HDFS and pre-processing it with Pig and Hive.
- Worked with various Hadoop file formats, including Text, SequenceFile, RCFile and ORC.
- Configured ZooKeeper for cluster coordination services.
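Illustrative sketch for the MapReduce filtering item above: a minimal mapper that keeps only lines matching a condition; the marker string and record layout are hypothetical placeholders.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class FilterMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
        private static final String MARKER = "ERROR"; // placeholder filter condition

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            // Emit only the lines that satisfy the condition; everything else is dropped.
            if (line.toString().contains(MARKER)) {
                context.write(line, NullWritable.get());
            }
        }
    }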
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, ZooKeeper, Flume, Kafka, Spark, Elasticsearch, Oozie, Impala, Java (JDK 1.6), Cloudera, Oracle 11g/10g, Windows, UNIX Shell Scripting.
Confidential
Hadoop Developer
Responsibilities:
- Imported data from Oracle into Hadoop through Sqoop and loaded it into Hive.
- Designed the control/job tables in HBase and MySQL. Created external Hive tables over HBase.
- Automated data pulls from SQL Server to the Hadoop ecosystem via Sqoop.
- Worked on data processing, mainly converting unstructured data to semi-structured data and loading it into Hive and HBase tables for integration.
- Worked on Hive and Impala tables (bucketing and partitioning).
- Worked on MapReduce and tested MapReduce jobs in the test environment using test files.
- Experience developing a batch-processing framework to ingest data into HDFS, Hive and HBase.
- Worked on Hive and Pig extensively to analyze network data.
- Designed and implemented HBase tables, Hive UDFs and Sqoop data transfers with complete ownership.
- Worked collaboratively with different teams to move the project smoothly to production.
- Built process automation for various jobs using Oozie.
- Worked extensively on performance tuning in Hive.
- Used Apache Kafka for importing real-time network log data into HDFS.
- Deployed and configured Flume agents to stream log events into HDFS for analysis.
- Loaded data into Hive tables using HQL, along with deduplication and windowing.
- Worked on HCatalog, which allows Pig and MapReduce to take advantage of the SerDe data format definitions written for Hive.
- Worked with different file formats (ORCFILE, TEXTFILE) and compression codecs (GZIP, SNAPPY, LZO).
- Worked with multiple input formats, such as Text, KeyValue and SequenceFile input formats.
- Worked with Java development teams on data parsing.
- Involved in developing a multi-threaded environment to improve the performance of merging operations.
- Involved in writing a Java program to add or remove headers from files (illustrative sketch at the end of this list).
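Illustrative sketch for the header-removal item above: a minimal Java utility that copies a file while skipping its header line; file paths come from the command line, and the multi-threaded merge wrapper mentioned above is omitted for brevity.

    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class HeaderStripper {
        // Copies a delimited file while skipping its first (header) line.
        public static void stripHeader(Path in, Path out) throws IOException {
            try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
                 BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
                reader.readLine(); // discard the header row
                String line;
                while ((line = reader.readLine()) != null) {
                    writer.write(line);
                    writer.newLine();
                }
            }
        }

        public static void main(String[] args) throws IOException {
            stripHeader(Paths.get(args[0]), Paths.get(args[1]));
        }
    }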
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, ZooKeeper, Flume, Kafka, Spark, Elasticsearch, Oozie, Impala, Java (JDK 1.6), Cloudera, Oracle 11g/10g, Windows, UNIX Shell Scripting.
Confidential
Java/J2ee Application Developer
Responsibilities:
- Developed code as per requirements.
- Involved in designing and developing the User Interface.
- Designed and developed the business layer of the application using Java.
- Created tables, views and procedures in SQL Server 2008.
- Helped team members with technical issues.
- Analyzed the functional requirement document and created the test case template.
- Performed end-to-end testing in development and integration environments.
- Ensured that unit testing was performed and that unit test documents were prepared.
- Utilized Agile Methodologies to manage full life-cycle development of the project.
- Implemented the MVC design pattern using the Struts framework (illustrative sketch at the end of this list).
- Developed the web application using JSP custom tag libraries and Struts Action classes.
- Designed Java servlets and objects using J2EE standards.
- Used Oracle 10g Database for data persistence.
- Used SQL Developer as the database client.
- Performed Test-Driven Development (TDD) using JUnit.
- Developed the XML Schema and Web services for the data maintenance and structures.
- Designed the database, created tables, and wrote complex SQL queries and stored procedures per requirements.
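Illustrative sketch for the Struts MVC item above: a minimal Struts 1.2 Action class; the class name, request parameter and forward name are hypothetical placeholders.

    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import org.apache.struts.action.Action;
    import org.apache.struts.action.ActionForm;
    import org.apache.struts.action.ActionForward;
    import org.apache.struts.action.ActionMapping;

    public class LoginAction extends Action {
        @Override
        public ActionForward execute(ActionMapping mapping, ActionForm form,
                                     HttpServletRequest request, HttpServletResponse response)
                throws Exception {
            // Controller step of the MVC flow: read request data, invoke business logic,
            // then hand control back to the view configured in struts-config.xml.
            String user = request.getParameter("username");
            request.setAttribute("welcomeMessage", "Hello, " + user);
            return mapping.findForward("success"); // "success" maps to a JSP in struts-config.xml
        }
    }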
Environment: Java, J2EE, Servlets, JSP, Struts 1.2, UML, Rational Rose, Oracle 9i, WebSphere 6, RAD 6, HTML, AJAX, JavaScript, JUnit, ANT, Web services, XML, SAX.
Confidential
Unix/Python Developer
Responsibilities:
- Participated in the complete SDLC process.
- Developed Business logic using Python 2.7.
- Worked in all phases of the application life cycle: requirements gathering from the client, analyzing project flow, attending client calls, development, deployment, delivery, documentation and support.
- Interacted with functional SMEs to understand the requirement impact on the system.
- Developed and maintained various automated web tools for reducing manual effort and increasing efficiency of the Global Trading Team.
- Developed Python scripts to parse XML documents.
- Interacted with customers to understand API documents and integrate the tool's functionality into the platform.
- Developed Python scripts to format and create daily transmission files.
- Created shell scripts for various tasks.
- Designed and developed solutions using C, C++, multi-threading and shell scripting.
- Developed unit test plans for UNIX.
- Developed code in C and UNIX control scripts.
- Set up data using SQL/Oracle/Teradata.
- Involved in design analysis, creating design documents, planning and managing day-to-day activities, and interacting with team members to facilitate smooth implementation.
Environment: Python 2.7, PyCharm, .NET, PyQuery, MVW, HTML5, Shell Scripting, JSON, Apache Web Server, SQL, UNIX, Windows, MySQL.