Sr. Hadoop Developer/Data Analyst Resume
Memphis, IL
PROFESSIONAL SUMMARY:
- Overall 7+ years of professional IT experience, including 3+ years with the Big Data ecosystem covering ingestion, storage, querying, processing, and analysis of big data using Tableau and Splunk, along with 1+ year of experience in Predictive Modeling and Regression Analysis using R and Statistical Tool development using Excel VBA.
- In-depth understanding of Hadoop architecture and its components, such as HDFS, NameNode, DataNode, ResourceManager, NodeManager, and the YARN/MapReduce programming paradigm, for working with Big Data and analyzing large data sets efficiently.
- Experience with configuration of Hadoop ecosystem components: MapReduce, Hive, HBase, Pig, Sqoop, Oozie, ZooKeeper, Flume, Storm, Spark, YARN, and Tez.
- Experience in importing/exporting terabytes of data using Sqoop from HDFS to RDBMS and vice versa.
- Experience working on processing data using Pig and Hive. Involved in creating Hive tables, data loading and writing hive queries.
- Hands-on experience with NoSQL databases like HBase. Knowledge of job workflow scheduling and monitoring tools like Oozie and ZooKeeper.
- Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
- Good knowledge on integrating the BI tools like Tableau, Splunk with the Hadoop stack and extracting the required Data for analysis.
- Experience with statistical & data analysis concepts including predictive modeling and machine learning techniques using R.
- Experience working in the Cloud environment like Amazon Web Services (AWS).
- Good knowledge of Hadoop administration activities such as installation and configuration of clusters using Apache and Cloudera distributions.
- Extensive RDBMS experience in writing Packages, Stored Procedures, Functions, Views & Triggers using SQL, PL/SQL.
- Performed optimization of SQL statements and procedures using Explain Plan, table partitioning, and hints.
- Effectively made use of Table Functions, Indexes, Table Partitioning, Analytical Functions, Materialized Views, Query Rewrite, and Transportable Tablespaces; partitioned large tables using the range partitioning technique.
- Made use of Bulk Collections for optimum performance, by reducing context switching between SQL and PL/SQL engines.
- Worked extensively on Dynamic SQL, Exception handling, Ref Cursor, External Tables and Collections.
- Proficiency in core Java concepts like OOP, Exception Handling, Generics, and the Collections Framework.
- Developed Rich Internet Applications using JPA, ORM, JSF, RichFaces 4.0, EJB 3.0, JMS, MVC architecture, and REST.
- Confidential Certified AIX 6.1: Basic Operations & Confidential Certified DB2 Associate: Fundamentals
- Hands-on experience with related/complementary open source software platforms and languages (e.g. Java, Linux, UNIX/AIX).
- Demonstrated success under aggressive project schedules and deadlines; flexible, results-oriented, and able to adapt to the environment to meet the goals of the product and the organization.
TECHNICAL SKILLS:
Hadoop: YARN, Hive, Pig, HBase, ZooKeeper, Sqoop, Oozie, Flume
BI tools: Tableau, Splunk, R
Machine Learning Algorithms: Predictive Modeling, Regression Analysis, Clustering, Decision Trees, PCA etc.
Platforms: UNIX, AIX, Windows XP/7, Linux
Languages: Java, C, Shell, Advanced PL/SQL, Python
Databases: Oracle 11g, MySQL, DB2 UDB
Architectures: MVC, SOA, Cloud Computing, Restful Web services
Frameworks: Spring, JPA, Hibernate, ORM, Java EE, JSF, EJB 3.0, JUnit Testing
Tools: Eclipse, NetBeans 8.0, SQL Developer 4.0, RStudio, Tableau, Splunk, MS Office
Web Technologies: HTML5, CSS, jQuery, Ajax, JavaScript, RichFaces 4.0
Methodology: Agile software development
PROFESSIONAL EXPERIENCE:
Confidential, Memphis, IL
Sr. Hadoop Developer/Data Analyst
Responsibilities:
- Manipulated, transformed, and analyzed data from various types of databases.
- Upgraded existing analytical tools and application systems developed on static platforms.
- Performed Tableau-based data analysis and visualization, including dashboard creation.
- Prepared infographics to present the results of market research projects and supported dissemination activities.
- Worked on development and application of software to extract, transform and analyze a variety of unstructured and structured data.
- Used R for predictive modeling and regression analysis.
- Worked extensively on creating MapReduce jobs to power data for search and aggregation (a minimal illustrative sketch follows this list).
- Designed a data warehouse using Hive.
- Worked extensively with Sqoop for importing data from Oracle.
- Extensively used Pig for data cleansing.
- Created partitioned tables in Hive.
- Worked with business teams and created Hive queries for ad hoc access.
- Evaluated usage of Oozie for Workflow Orchestration.
- Mentored the analyst and test teams in writing Hive queries.
Environment: Hadoop, MapReduce, HDFS, Hive, Java, CDH, Oozie, Oracle 11g/10g, Tableau, R, Excel VBA
Confidential, Los Angeles, CA
Hadoop Developer
Responsibilities:
- Designed, developed, and tested a package of software applications (Pig scripts, Hive UDFs, and MapReduce jobs) for fusing server data tagged by web clients.
- Worked on analyzing the Hadoop cluster and different big data analytics tools, including Pig, the HBase database, and Sqoop.
- Processed and deployed complex, large volumes of structured and unstructured data coming from multiple systems.
- Responsible for building scalable distributed data solutions using Hadoop.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Developed MapReduce jobs to automate the transfer of data from HBase.
- Performed daily monitoring of cluster status and health, including DataNode, JobTracker, TaskTracker, and NameNode, using Unix/Linux commands.
- Performed large-scale data analytics using Hadoop and NoSQL database technologies; managed, mentored, and guided a team of junior developers.
- Developed scripts in Pig Latin and executed them with the Grunt shell.
- Involved in conducting code and design reviews to ensure the team followed the same standards.
- Worked with the Reporting and Statistics team to build polished reports.
- Performed data analysis using Hive and Pig.
- Created HBase tables to store various data formats coming from different applications (see the table-creation sketch after this list).
- Responsible for Technical Specification documents.
Environment: Java JDK, Eclipse, CDH 4, YARN, MapReduce, HDFS, Apache Hadoop, Pig Latin, Hadoop clusters, Hive, Sqoop, ZooKeeper, Oracle, etc.
Confidential, Peoria, IL
Hadoop Developer/Administrator
Responsibilities:
- Installed and configured Hadoop MapReduce, HDFS, and YARN.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Imported and exported data into HDFS and Hive using Sqoop.
- Experienced in defining job flows.
- Experienced in managing and reviewing Hadoop log files.
- Extracted files from CouchDB through Sqoop, placed them in HDFS, and processed them.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Responsible for managing data coming from different sources.
- Gained good experience with NoSQL databases.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from UNIX file system to HDFS.
- Installed and configured Hive and wrote Hive UDFs (a minimal UDF sketch follows this list).
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
Environment: Java 6, Eclipse, Linux, Hadoop, HBase, Sqoop, Pig, Hive, MapReduce, HDFS, Flume, XML, SQL, MySQL
Confidential
System Engineer
Responsibilities:
- Performed data backup, new server configuration/SSH setup and crontab jobs setup during server migration.
- Supervised multiple AIX servers across EMEA/NA/AP hosting high-reliability, business-critical systems, handling large updates, low-memory conditions, and job failures with business impact.
- Created shell scripts for several jobs and scheduled/monitored 100+ crontab jobs per day.
- Monitored systems and provided 24x7 support.
- Worked with the pre-production team to oversee new data feed releases.
- Participated in IGS's Executive Alert process for Severity 1 problems.
- Maintained and ran data feed production jobs.
- Worked on the back end of the Confidential e-commerce web application on the AIX platform.
- Processed XML data on a daily basis and wrote scripts for resolving errors.
- Developed scripts for smooth transition of logs during version upgrades and server migrations.
- Performed optimization of SQL statements and procedures using Explain Plan, table partitioning, and hints.
- Effectively made use of Table Functions, Indexes, Table Partitioning, Analytical Functions, Materialized Views, Query Rewrite, and Transportable Tablespaces; partitioned large tables using the range partitioning technique.
- Made use of Bulk Collections for optimum performance, by reducing context switching between SQL and PL/SQL engines.
- Worked extensively on Dynamic SQL, Exception handling, Ref Cursor, External Tables and Collections.
- Designed and developed the Extraction, Transformation, and Loading (ETL) program to move data from source to target, mapping source data attributes to target data attributes using Informatica.
- Created matrix reports, visualization charts and dashboards to analyze batch performance using MS Excel VBA.
- Coordinated with the BAM and Delivery Managers to complete deliverables within SLA.
- Wrote several SQL scripts for XML data processing, which were invoked by shell scripts.
Environment: AIX/UNIX, Oracle PL/SQL, MS Excel, RPM, Lotus Notes
Confidential
Application Developer
Responsibilities:
- Implemented the project using ORM, JPA, JSF, EJB 3.0, and MVC architecture.
- Used JavaMail to notify users by email after successful ticket booking.
- Incorporated security using GlassFish Realm features to add roles, providing both authentication and authorization.
- Deployed this application on AWS cloud services.
- Developed RESTful APIs to maintain runtime dependencies between two different applications and wrote methods to parse XML requests and generate XML responses (a minimal resource sketch follows this list).
- Wrote a shell script using the AWS CLI that runs on AWS Ubuntu and sets up the three-tier backend architecture on the AWS cloud, creating EC2 instances, ELB, RDS, an S3 bucket, SQS, and SMS/SES with minimal human intervention.
- Used AWS SES to send email alerts to users when image compression completed.
Features:
- The application has a three-tier architecture.
- Users can create accounts to log in.
- The admin has a different UI.
Environment: XML, Java, Ubuntu, JSP, JSF, MVC Architecture, MS Word, MS Outlook, Eclipse, UNIX