Big Data Analyst Resume
Cleveland
PROFESSIONAL SUMMARY:
- Over 8 years of experience in Analysis, Design, Development, Testing, Implementation, Maintenance and Enhancements on various IT projects, including 3+ years of experience implementing end-to-end Big Data solutions on Hadoop.
- Strong fundamentals and understanding of HDFS and the complete Hadoop ecosystem.
- Strong experience and knowledge of the Hadoop ecosystem, including MapReduce, Hive, Pig, HBase, Hue, Sqoop, Oozie and Scala.
- Experience in importing and exporting data between HDFS and Relational Database Systems (RDBMS) using Sqoop; a representative command sketch follows this summary.
- Proficient in automating data workflow scheduling using Oozie and shell scripting, large-scale data ingestion using Sqoop, and ETL tasks using Pig.
- Extensive knowledge in using SQL Queries for backend database analysis.
- Developed Informatica ETL mappings and workflows to move the daily generated data from DB2 source systems to RDBMS and HDFS, using transformations such as Source Qualifier, Expression and Lookup.
- Experience in the Extract, Transform and Load (ETL) process for client input text files using SQL Server.
- Created, tested, performance tuned, and implemented complex stored procedures using Microsoft SQL Server 2012.
- Developed custom processes for clients based on their specific requirements utilizing SQL Server.
- Designed quality assurance SQL scripts for testing the ETL process.
- Expertise in Core Java, J2EE, Multithreading, JDBC and proficient in using Java APIs for application development.
- Worked on different operating systems such as UNIX/Linux, Windows XP, and Windows 2000.
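A minimal shell sketch of the kind of Sqoop ingestion and export described in this summary; the connection string, table names, credentials, and HDFS paths are illustrative assumptions, not values from an actual engagement:

```sh
#!/bin/bash
# Hypothetical example: pull a source table from a DB2 RDBMS into HDFS with
# Sqoop, then export a curated result back out. All names and paths are
# placeholders, not real project values.

# RDBMS -> HDFS: import the source table as text files with 4 parallel mappers
sqoop import \
  --connect jdbc:db2://db-host:50000/SALESDB \
  --username etl_user --password-file /user/etl/.db2_pass \
  --table DAILY_ORDERS \
  --target-dir /data/raw/daily_orders \
  --num-mappers 4 \
  --as-textfile

# HDFS -> RDBMS: export the curated summary back to a relational table
sqoop export \
  --connect jdbc:db2://db-host:50000/SALESDB \
  --username etl_user --password-file /user/etl/.db2_pass \
  --table ORDER_SUMMARY \
  --export-dir /data/curated/order_summary \
  --num-mappers 4
```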
TECHNICAL SKILLS:
Big Data Ecosystem: HDFS, MapReduce, HBase, Hive, Pig, Sqoop, Oozie, Scala (basics)
J2EE Technologies: Servlets, JSP, JDBC
Languages: Java, SQL, C#, C, C++, HTML, JavaScript, XML
ETL Tools: Informatica 9.5.1 - Power Center
Concepts: Core DBMS, Data mining
Tools: DBVisualizer, DQ Analyzer
Operating Systems: UNIX/Linux, Windows 98, 2000, XP, Vista, Windows 7
Office Tools: MS Word, PowerPoint, Excel, SSIS (basics)
Servers: Tomcat
PROFESSIONAL EXPERIENCE:
Confidential, Cleveland
Big Data Analyst
Responsibilities:
- Interacted with the client directly to gather the requirements of the project.
- Used Sqoop to perform continuous data ingestion from SQL databases and other source systems.
- Implemented the data transformations using Hive and Pig.
- Used Oozie workflows to schedule the shell, Hive, and Pig actions.
- Worked on error handling in shell scripting.
- Involved in loading data from RDBMS into HDFS using Sqoop queries.
- Handled delta processing (incremental updates) using Hive and processed the data in Hive tables; a merge sketch follows this list.
- Worked on Hive optimization techniques to improve the performance of long running jobs.
- Responsible for managing data coming from different sources.
- Involved in creating Pig tables, loading them with data, and writing Pig Latin queries that run internally as MapReduce jobs.
- Documented the system processes and procedures for future reference.
- Involved in writing Unix/Linux shell scripts for scheduling jobs and running HiveQL.
- Created Hive tables for the files moved into HDFS.
- Used Hive for transformations on the tables.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke MapReduce jobs in the backend.
- Used the Oozie scheduler to automate the pipeline workflow and orchestrate the MapReduce jobs that extract the data in a timely manner.
- Actively participated in software development lifecycle (scope, design, implement, deploy, test), including design and code reviews, test development, test automation.
- Involved in story-driven agile development methodology and actively participated in daily scrum meetings with a team of 6 people.
- Responsible for preparing the File level and Column level metadata.
- Involved in data modeling, such as preparing source-to-target mappings of fields based on the source data.
- Extensively used the Hue browser for interacting with Hadoop components.
- Responsible for moving the data from source (Teradata, Oracle) to HDFS and Unit Testing of the files moved.
- Actively involved with infrastructure teams in addressing the platform issues.
- Worked on transformations to create new datasets from existing datasets.
- Involved in testing activities within QA environment which include System testing, Integration testing and writing test cases.
- Created ETL Processes using Informatica.
- Used Informatica to design the mappings and ETL workflows to transfer the data from the source systems (DB2, Oracle) to HDFS.
- Created events and tasks in the workflows using Workflow Manager.
- Designed Informatica mappings by translating the business requirements.
- Developed reusable sessions and workflows.
- Widely used Informatica client tools: Source Analyzer, Warehouse Designer, Mapping Designer, Transformation Developer, and Workflow Manager.
- Used Workflow Manager for session management, database connection management, and scheduling of jobs.
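A minimal shell/HiveQL sketch of the delta-processing step referenced in the responsibilities above; the database, table, and column names are illustrative assumptions:

```sh
#!/bin/bash
# Hypothetical delta merge: combine the existing curated table with the day's
# delta load and keep only the latest record per business key. All table and
# column names are placeholders.

hive -e "
DROP TABLE IF EXISTS curated.customer_merged;

CREATE TABLE curated.customer_merged AS
SELECT id, name, updated_ts
FROM (
  SELECT id, name, updated_ts,
         ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_ts DESC) AS rn
  FROM (
    SELECT id, name, updated_ts FROM curated.customer
    UNION ALL
    SELECT id, name, updated_ts FROM staging.customer_delta
  ) unioned
) ranked
WHERE rn = 1;
"
```

In a workflow of this kind, such a statement would typically run as a Hive or shell action inside the Oozie workflow described above.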
Environment: Linux, Informatica, Hadoop, Sqoop, Hive, Oozie, HDFS, Pig, Hue
Confidential, Cleveland
ETL Hadoop Developer
Responsibilities:
- Worked as production support for this project, resolving issues from daily operations and the monthly processing runs driven by Oozie workflows (a monitoring sketch follows this list).
- Used Hive and SQL for the enhancements of the project.
- Used Hive, Sqoop, and Oozie to design the ETL process for loading the data provided by the client.
- Designed Quality Assurance scripts in SQL.
- Interacted with the client directly to gather the requirements for the enhancements.
- Provided production support for the daily, weekly, and monthly activities.
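A small shell sketch of the routine Oozie checks this kind of production support usually involves; the Oozie URL, workflow ID, and properties file are illustrative assumptions:

```sh
#!/bin/bash
# Hypothetical daily support checks: list recent workflow runs, inspect a
# failed run, and rerun it after the fix. The server URL, job ID, and
# properties file are placeholders.

OOZIE_URL=http://oozie-host:11000/oozie

# List the 20 most recent workflow jobs and their status
oozie jobs -oozie "$OOZIE_URL" -jobtype wf -len 20

# Drill into one (placeholder) workflow run to locate the failed action
oozie job -oozie "$OOZIE_URL" -info 0000123-230101000000000-oozie-W

# Rerun only the failed nodes of that workflow after fixing the issue
oozie job -oozie "$OOZIE_URL" \
  -rerun 0000123-230101000000000-oozie-W \
  -config job.properties -Doozie.wf.rerun.failnodes=true
```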
Environment: Windows, SQL, Hadoop, Pig, Hive, Oozie
Confidential, Detroit
ETL Hadoop Developer
Responsibilities:
- Used SQL to design the ETL process for loading the input data provided.
- Used SQL to query the data in RDBMS and analyze it.
- Used Servlets to connect to the RDBMS and to display the reports in the front end.
- Used JSP pages to design a dashboard for the client to view reports and other data.
- Used Java to connect to the RDBMS and dynamically display the current record information requested by the client from the UI dashboard.
- Coordinated with Subject Matter Experts on changes and issues.
- Designed quality assurance SQL scripts for testing the ETL process.
- Generated the monthly and daily report decks using automated SQL jobs, ensuring all reports were produced and available on the dashboard.
- Involved in the creation of SQL tables, views, stored procedures, and functions.
- Loaded data into the staging tables every day, from which reporting and other related queries are run.
- Upgraded the ETL process to use HDFS for data storage and other Hadoop-based tools such as Sqoop, Oozie, and Hive to carry out the ETL process.
- Worked on automating the ETL processes using Oozie workflows.
- Worked on performance issues, ensuring that Hive queries executed faster on the Tez execution engine.
- Used shell scripting to automate the daily processing.
- Worked on four ETL load types: full, CDC, incremental, and upsert (an incremental-load sketch follows this list).
- Experienced in the use of agile approaches, including Extreme Programming, Test-Driven Development and Scrum.
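A short shell/HiveQL sketch of an incremental load run on the Tez engine, in the spirit of the load types listed above; the high-water-mark logic, table names, and settings are illustrative assumptions:

```sh
#!/bin/bash
# Hypothetical incremental (append-only) load on Tez: only rows newer than the
# last loaded timestamp are inserted into the partitioned reporting table.
# Table and column names are placeholders.

# Read the current high-water mark from the target table
LAST_TS=$(hive -S -e "SELECT COALESCE(MAX(load_ts), '1970-01-01 00:00:00') FROM reporting.orders;")

# Append only the newer rows, partitioned by load date, using the Tez engine
hive -e "
SET hive.execution.engine=tez;
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT INTO TABLE reporting.orders PARTITION (load_date)
SELECT order_id, status, amount, load_ts, to_date(load_ts) AS load_date
FROM staging.orders_delta
WHERE load_ts > '${LAST_TS}';
"
```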
Environment: Linux, Windows, SQL, Java, Sqoop, Hive, Oozie, Informatica
Confidential
Java Developer
Responsibilities:
- Involved right from the inception phase.
- Understood and analyzed the requirements.
- Prepared screenshots for all modules.
- Coordinated with project managers, development, and QA teams during the course of the project.
- Developed the Controller and Connection pool.
- Performed database design for all modules.
- Incorporated Internationalization.
- Used Eclipse IDE for developing the applications.
- Involved in generating Fingerprint based attendance for management, staff and students.
- Involved in sending Bulk SMS.
- Involved in Production Support activities for quick fixes using SQL.
- Involved in code review and testing for library management and progress report.
- Prepared documentation and participated in preparing user’s manual for the application.
- Prepared deployment plans for production deployments.
- Involved in writing client-side validations using JavaScript.
- Used Tomcat web server for development purpose.
- Provided daily development status, weekly status reports, and weekly development summary and defects report.
- Responsible for removing bugs, adding new features, and enhancing performance and scalability in the application.
Environment: Java, Tomcat.
Confidential
Java Developer
Responsibilities:
- Involved in the Enhancement project of the product, which involved development of Front-end module using JSP.
- Use case development and database design for all modules.
- Used Design Patterns like MVC and Singleton.
- Involved in Inventory Management.
- Involved in code review of Expenditure Management.
- Prepared Screenshots for all the modules.
- Gathered requirements from the users for enhancements or issues and facilitated the transfer of those requirements to the team.
- Worked on the design and coding of the same.
- Created and managed defect reports.
- Conducted knowledge-sharing sessions to familiarize the team with the business rules.
- Guided the team technically in resolving any issues faced during coding.
- Actively participated in team meetings.
- Involved in creating MySQL stored procedures for data/business logic.
- Performed code reviews to ensure code meets coding standards.
Environment: Core Java, JSP, Servlets, Tomcat.
Confidential
Responsibilities:
- Developed a parameterizable technique to recommend indexes based on index types.
- Reduced the time taken to answer the query.
- Involved in code review.
- Involved in User Acceptance and Use case Testing.
- Developed technical designs for application development.
- Developed application code for Java programs.
- Developed logical and physical models, deployed applications, and provided documentation for all processes.
- Prepared documents for project standards, maintained their accuracy, managed technical resources to meet requirements, and performed tests on various processes in coordination with development teams.
- Involved in unit testing and resolving test defects.
Environment: Java, JSP, Servlets, Tomcat server 5.0, SQL Server 2000, Data mining