Big Data Consultant Resume
NC
SUMMARY:
- 8+ years of IT experience in Big Data Analytics, Hadoop and Java development.
- Hands-on experience in designing, testing and deploying MapReduce applications in the Hadoop ecosystem.
- Hands-on experience in installing, configuring and using Hadoop ecosystem components such as MapReduce, HDFS, Pig, Hive, Sqoop, Flume, Oozie and HBase.
- Experience in manipulating/analyzing large datasets and finding patterns and insights within structured and unstructured data.
- Good understanding of NoSQL databases and hands-on work experience in writing applications on NoSQL databases like HBase, Cassandra and MongoDB.
- Experience with the Oozie workflow scheduler to manage Hadoop jobs as Directed Acyclic Graphs (DAGs) of actions with control flows.
- Strong experience creating real-time data streaming solutions using Apache Spark Core, Spark SQL & DataFrames, Spark Streaming, Apache Storm and Kafka.
- Hands-on experience with systems-building languages such as Scala and Java.
- Hands-on experience with message brokers such as Apache Kafka and RabbitMQ.
- Experienced in working with Hadoop clusters using Cloudera, HortonWorks and MapR distributions.
- Experienced in working with Amazon Web Services (AWS), using EC2 for compute and S3 for storage.
- Solid experience in importing and exporting data into and out of HDFS using Sqoop, and further processing schema-oriented or non-schema-oriented data using Pig.
- Strong experience in collecting and storing log data in HDFS using Apache Flume.
- Strong experience in all phases of the software development life cycle (SDLC), including requirements gathering, analysis, design, implementation and support.
- Experienced in using agile approaches, including Extreme Programming, Test-Driven Development and Agile Scrum.
- Experience in storing and analyzing data using HiveQL, HBase and custom MapReduce programs in Java.
- Experience with development and design of solutions using Java in a test driven approach.
- Excellent object-oriented programming (OOP) skills with C++ and Java, and an in-depth understanding of data structures and algorithms.
- Hands-on experience with SQL and MySQL.
- Strong statistical, mathematical and predictive modeling skills and experience.
- Strong command over relational databases: MySQL, Oracle, SQL Server and MS Access.
- Working knowledge of Python.
- Experience in writing SQL queries and functions.
- Ability to work independently and with a group of peers in a results-driven environment.
- Strong analytical and problem-solving skills; able to take initiative and learn emerging technologies and programming languages.
TECHNICAL SKILLS:
Technologies: HDFS, YARN, MapReduce, Hive, Pig, Sqoop, HBase, Flume, Oozie, Zookeeper, Teradata, Spark, Impala, Kafka, Storm, Drill.
Hadoop Platforms: Cloudera, HortonWorks and MapR distributions.
Programming Languages: C++, Java, Python.
Java/J2EE: Java 5/6, Spring 3.0, Eclipse, NetBeans, JDBC
Database: MySQL, Oracle, SQL Server, MS Access, DB2.
Operating Systems: Windows, Linux, UNIX, Mac OS.
PROFESSIONAL EXPERIENCE:
Confidential, NC
Big Data Consultant
Responsibilities:
- Applied professional software engineering practices and best practices for the full software development life cycle on Hadoop, including coding standards, code reviews, source control management and build processes.
- Developed a framework to import and export data from various sources such as Teradata, Oracle, SQL Server and flat files into HDFS.
- Used Spark Streaming APIs to perform the necessary transformations and actions on the fly for building the common learner data model, which receives data from Kafka in near real time and persists it into Cassandra (a minimal sketch follows this list).
- Configured, deployed and maintained multi-node Dev and Test Kafka clusters.
- Developed Spark scripts by using Scala shell commands as per the requirement.
- Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
- Developed Scala scripts and UDFs using both DataFrames/SQL/Datasets and RDD/MapReduce in Spark 1.6 for data aggregation, queries and writing data back into the OLTP system through Sqoop.
- Extensively used Sqoop to connect with various databases to import data into Hive.
- Determined the appropriate number of mappers for Sqoop imports and exports.
- Developed scripts to import files using the WebHDFS approach.
- Wrote a set of MapReduce jobs in Python and shell to parse terabytes of customer data and convert it to CSV format.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Parsed XML files using MapReduce to extract sales-related attributes and store them in HDFS.
- Exposure to different data preparation stages such as staging and journaling.
- Developed scripts in Shell and Python to propagate data from Source -> Preparation -> Stage -> Journal.
- Worked extensively with Hive data types and casting, as well as Hive transactional properties such as UPDATE and DELETE.
- Proactively tuned complex Hive queries.
- Implemented partitioning, dynamic partitions and buckets in Hive.
- Extending Hive and Pig core functionality by writing custom UDFs.
- Worked on debugging and performance tuning of Hive and Pig jobs.
- Developed Pig Latin scripts for data cleansing and data transformation.
- Used the HBase shell to perform DDL and DML operations such as creating and dropping tables, and loaded data into HBase from HDFS using the Java API.
- Automated scripts using Automic UC4.
- Experienced in managing and reviewing Hadoop log files.
- Collected and aggregated large amounts of log data using Apache Flume and staged it in HDFS for further analysis.
- Documented operational problems by following standards and procedures, using JIRA as the software reporting tool.
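A minimal sketch of the Kafka-to-Cassandra streaming flow described above, assuming the Spark 1.6 direct-stream Kafka API and the DataStax Spark-Cassandra connector; the broker address, topic, keyspace, table and LearnerEvent fields are illustrative placeholders rather than the actual project schema.

    import java.io.Serializable;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Set;

    import kafka.serializer.StringDecoder;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaPairInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka.KafkaUtils;

    import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;
    import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow;

    public class LearnerEventStream {

        // Bean whose properties are expected to line up with the target Cassandra columns.
        public static class LearnerEvent implements Serializable {
            private String learnerId;
            private String activity;
            public LearnerEvent() { }
            public LearnerEvent(String learnerId, String activity) {
                this.learnerId = learnerId;
                this.activity = activity;
            }
            public String getLearnerId() { return learnerId; }
            public void setLearnerId(String learnerId) { this.learnerId = learnerId; }
            public String getActivity() { return activity; }
            public void setActivity(String activity) { this.activity = activity; }
        }

        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("LearnerEventStream");
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            Map<String, String> kafkaParams = new HashMap<>();
            kafkaParams.put("metadata.broker.list", "broker1:9092");      // placeholder broker
            Set<String> topics = Collections.singleton("learner-events"); // placeholder topic

            // Receiver-less direct stream from Kafka; keys and values are plain strings.
            JavaPairInputDStream<String, String> stream = KafkaUtils.createDirectStream(
                    jssc, String.class, String.class, StringDecoder.class, StringDecoder.class,
                    kafkaParams, topics);

            // Assume each message is "learnerId,activity"; transform it and persist each micro-batch.
            stream.map(record -> record._2())
                  .map(line -> {
                      String[] parts = line.split(",", 2);
                      return new LearnerEvent(parts[0], parts[1]);
                  })
                  .foreachRDD((JavaRDD<LearnerEvent> rdd) -> {
                      javaFunctions(rdd)
                          .writerBuilder("learner_ks", "learner_events", mapToRow(LearnerEvent.class))
                          .saveToCassandra();
                  });

            jssc.start();
            jssc.awaitTermination();
        }
    }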
ENVIRONMENT: HDP 2.x, Flume, Hive, Sqoop, MapReduce, HBase, WebHDFS, Pig, Oozie, UC4, SQL, Teradata, Hadoop YARN, Spark Core, Spark Streaming, Spark SQL, Java, Linux, CentOS, Python and shell scripting.
Confidential, Cary, NC
Big Data Consultant
Responsibilities:
- Proactively monitored systems and services; handled architecture design and implementation of the Hadoop deployment, configuration management, backups, and disaster recovery systems and procedures.
- Designed and developed scalable custom Hadoop solutions as per dynamic data needs.
- Supported technical team members in management and review of Hadoop log files and data backups.
- Participated in development and execution of system and disaster recovery processes.
- Involved in analyzing system failures, identifying root causes and recommending corrective actions.
- Consumed data from Kafka using Apache Spark.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Involved in loading data from the Linux file system into HDFS.
- Imported and exported data into HDFS and Hive using Sqoop.
- Worked on batch processing of data using Apache Hadoop, MapReduce and Apache Pig.
- Built real-time Big Data solutions using HBase, handling billions of records.
- Used Flume to collect, aggregate, and store the web log data from different sources like web servers, mobile and network devices and pushed to HDFS.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Implemented partitioning, dynamic partitions and buckets in Hive (see the Hive sketch after this list).
- Exported result sets from Hive to MySQL using shell scripts.
- Extended Hive and Pig core functionality with custom User Defined Functions (UDFs), User Defined Table-Generating Functions (UDTFs) and User Defined Aggregate Functions (UDAFs) written in Python.
- Loaded log data into HDFS using Flume and Kafka and performed ETL integrations.
- Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as Java MapReduce, Pig, Hive and Sqoop) as well as system-specific jobs (such as Java programs and shell scripts).
- Managed and reviewed Hadoop log files.
- Assisted in designing, development and architecture of Hadoop and HBase systems.
- Coordinated with technical teams on the installation of Hadoop and related third-party applications on systems.
- Formulated procedures for planning and execution of system upgrades for all existing Hadoop clusters.
- Supported technical team members for automation, installation and configuration tasks.
- Suggested improvement processes for all process automation scripts and tasks.
- Worked with Teradata Appliance team, HortonWorks PM and Engineering Team, Aster PM and Engineering team.
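A minimal sketch of the Hive partitioning and bucketing work noted above, submitted through the HiveServer2 JDBC driver; the connection URL, user, table and column names are illustrative placeholders.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class HivePartitioningJob {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            // HiveServer2 endpoint; host and database are placeholders.
            String url = "jdbc:hive2://hiveserver:10000/default";
            try (Connection conn = DriverManager.getConnection(url, "hive", "");
                 Statement stmt = conn.createStatement()) {

                // Allow dynamic partitions and enforce bucketing on insert.
                stmt.execute("SET hive.exec.dynamic.partition=true");
                stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict");
                stmt.execute("SET hive.enforce.bucketing=true");

                // Target table partitioned by load date and bucketed by customer id.
                stmt.execute("CREATE TABLE IF NOT EXISTS web_logs_part ("
                        + " customer_id STRING, url STRING, status INT)"
                        + " PARTITIONED BY (load_date STRING)"
                        + " CLUSTERED BY (customer_id) INTO 16 BUCKETS"
                        + " STORED AS ORC");

                // Dynamic-partition insert: the last selected column feeds the partition key.
                stmt.execute("INSERT OVERWRITE TABLE web_logs_part PARTITION (load_date)"
                        + " SELECT customer_id, url, status, load_date FROM web_logs_staging");
            }
        }
    }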
ENVIRONMENT: HDP 2.x, Flume, Hive, Sqoop, MapReduce, HBase, Kafka, Apache Spark, WebHDFS, Pig, Oozie, UC4, SQL, Teradata, Java, Linux, CentOS, Python, HortonWorks, MySQL.
Confidential, Tempe, AZ
Hadoop Developer
Responsibilities:
- Developed self-contained micro-analytics on Search platform, leveraging the content and prior capabilities associated with the MRL Search Pilot.
- Developed search/analytics product features and fixes following an Agile methodology.
- Developed test cases for codebase using JMockit API.
- Wrote Hive queries for ingesting and indexing data from Merck Chemical Identifier Database (MCIDB) which is used by Merck scientists for performing research activities throughout the Drug Discovery phase.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Supported MapReduce programs running on the cluster.
- Worked on Hadoop MapReduce to process historical data in batches, using HBase as a scalable data store.
- Exported analyzed data to Oracle database using Sqoop for generating reports.
- Worked with Sqoop import and export functionality to handle large data set transfers between the DB2 database and HDFS.
- Built aggregations using Hive queries and implemented transformation rules using Pig Latin scripts.
- Developed custom UDFs in Hive to enhance search capabilities using the Chemistry Development Kit (CDK) library (a simplified UDF sketch follows this list).
- Created cron jobs to ingest and index data periodically.
- Developed Puppet scripts to install Hive, Sqoop and other components on the nodes.
- Actively took part in daily scrum meetings, bi-weekly sprint planning and closeout meetings.
- Worked with highly engaged Informatics, Scientific Information Management and enterprise IT teams.
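A simplified sketch of a custom Hive UDF of the kind described above; the production UDFs relied on the CDK library for chemistry-specific logic, while this illustrative version only trims and upper-cases an identifier string.

    import org.apache.hadoop.hive.ql.exec.Description;
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    @Description(name = "normalize_id",
                 value = "_FUNC_(str) - trims and upper-cases a chemical identifier")
    public final class NormalizeIdUdf extends UDF {

        // Hive calls evaluate() once per row; returning null propagates SQL NULL.
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            return new Text(input.toString().trim().toUpperCase());
        }
    }

Packaged into a jar, such a function would typically be registered in Hive with ADD JAR followed by CREATE TEMPORARY FUNCTION normalize_id AS 'NormalizeIdUdf'.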
ENVIRONMENT: Hadoop, MapReduce, HBase, Hive, Pig, Sqoop, Puppet.
Confidential, Tempe, AZ
Java/J2EE Developer
Responsibilities:
- Developed presentation components using JSP, Struts tag library, JSTL, JavaScript, CSS, XML, and HTML.
- Developed Struts actions, form beans, business objects and adapter components for the controller and model layers of the MVC pattern.
- Developed business validation scripts for the UI using form beans and the validation.xml Struts component.
- Used XMLBeans to communicate with enterprise web services from UI layer for accessing enterprise business data.
- Involved in developing and integrating web services (customer view, account view, set lending data, get lending data, get credit report, search credit reports, search application, customers, product recommendations, etc.) using XML, XMLBeans, XML Schema, SOAP and WSDL.
- Designed Java components and integrated them using the Spring framework with Hibernate as the object/relational persistence mechanism.
- Used SOAP-based XML web services to send transfer amounts to a remote, global transfer application serving different financial institutions.
- Used XML parser APIs such as JAXP (SAX) and JAXB for marshalling and unmarshalling web-service request/response data (see the JAXB sketch after this list).
- Worked with business analysts to convert business requirements into technical specifications and implementation.
- Used CVS for version control across common source code used by developers.
- Assisted the technology QA team with application testing and integration testing.
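A minimal sketch of the JAXB marshalling/unmarshalling pattern used for web-service payloads; the CreditReportRequest type and its field are illustrative placeholders, not the actual service schema.

    import java.io.StringReader;
    import java.io.StringWriter;

    import javax.xml.bind.JAXBContext;
    import javax.xml.bind.JAXBException;
    import javax.xml.bind.Marshaller;
    import javax.xml.bind.Unmarshaller;
    import javax.xml.bind.annotation.XmlRootElement;

    @XmlRootElement(name = "creditReportRequest")
    public class CreditReportRequest {

        private String customerId;

        public String getCustomerId() { return customerId; }
        public void setCustomerId(String customerId) { this.customerId = customerId; }

        public static void main(String[] args) throws JAXBException {
            JAXBContext context = JAXBContext.newInstance(CreditReportRequest.class);

            // Marshal a request object into the XML payload sent to the web service.
            CreditReportRequest request = new CreditReportRequest();
            request.setCustomerId("C-1001");
            StringWriter xml = new StringWriter();
            Marshaller marshaller = context.createMarshaller();
            marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
            marshaller.marshal(request, xml);

            // Unmarshal a payload back into a Java object.
            Unmarshaller unmarshaller = context.createUnmarshaller();
            CreditReportRequest parsed = (CreditReportRequest)
                    unmarshaller.unmarshal(new StringReader(xml.toString()));
            System.out.println(parsed.getCustomerId());
        }
    }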
Environment: Core Java, J2EE, JSP, Servlets, XML, XSLT, EJB 3.0, JDBC, Akka, Scala, JBuilder 8.0, JBoss, Swing, JavaScript, JMS, HTML, CSS, MySQL Server, CVS, Windows 2000.
Confidential, Novato, CA
Java Developer
Responsibilities:
- Involved in analysis, design and development of insurance plans using Agile methodology.
- Deployed the applications on IBM WebSphere Application Server.
- Used Oracle 10g as the backend database.
- Involved in fixing bugs reported by the testing teams in various applications during integration, and used Bugzilla for bug tracking.
- Used TortoiseCVS for version control across common source code used by developers.
- Developed user interface using Struts Tiles Framework, JSP, HTML, XHTML and Java Script to simplify the complexities of the application.
- Used Ajax for intensive user operations and client-side validations.
- Checked, identified and designed effective changes in the backend service.
- Developed and designed the front end using XML, XSLT, HTML, and CSS.
- Made the main code changes in the JSP, jQuery, AJAX and JavaScript environment.
- Migrated UI screens from JSP to JSF 2 using the third-party library PrimeFaces.
- Implemented JSF-based custom validators for UI components (see the validator sketch after this list).
- Performed client-side as well as server-side validations using JavaScript and Spring Validation.
- Used web services for creating rate summaries, used WSDL and SOAP messages to get insurance plans from different modules, and used XML parsers for data retrieval.
- Used SQL statements and procedures to fetch the data from the database.
- Developed Ant Scripts for the build process and deployed in IBM WebSphere.
- Implemented Log4J for error logging, debugging and tracking using logger and appender components.
- Used MongoDB as the database for persisting/storing JSON data.
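A minimal sketch of a JSF 2 custom validator of the kind noted above; the validator id and the plan-code rule are illustrative assumptions, not the project's actual validation logic.

    import javax.faces.application.FacesMessage;
    import javax.faces.component.UIComponent;
    import javax.faces.context.FacesContext;
    import javax.faces.validator.FacesValidator;
    import javax.faces.validator.Validator;
    import javax.faces.validator.ValidatorException;

    @FacesValidator("planCodeValidator")
    public class PlanCodeValidator implements Validator {

        // Invoked by JSF during the Process Validations phase for the bound input component.
        @Override
        public void validate(FacesContext context, UIComponent component, Object value)
                throws ValidatorException {
            String planCode = value == null ? "" : value.toString().trim();
            // Illustrative rule: plan codes look like "PLN-1234".
            if (!planCode.matches("PLN-\\d{4}")) {
                FacesMessage message = new FacesMessage(FacesMessage.SEVERITY_ERROR,
                        "Invalid plan code", "Plan code must match PLN-9999.");
                throw new ValidatorException(message);
            }
        }
    }

On the page, the validator would be attached with <f:validator validatorId="planCodeValidator"/> inside the input component.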
Environment: Java 1.6, Spring Framework 3.0, JSP, JDBC 4.0, WebLogic Application Server 10.3, Tomcat 7 Server, JNDI, JMS, XML, XSLT, SAX, JavaScript, Rational Rose UML, CVS, Log4J 1.2, JUnit 4.0, shell scripting, Sun Solaris UNIX OS and Oracle 10g.
Confidential
Java Developer
Responsibilities:
- Involved in analysis and design of the application.
- Understood business functions and designed and developed the application framework.
- Designed, developed and implemented a web-based call center solution using PHP, JavaScript, Ajax and HTML.
- Involved in interacting with end users for requirement analysis.
- Involved in connecting to the database from servlets using JDBC (see the sketch after this list).
- Involved in writing complex multi-table joins and conditional queries against the database.
- Worked extensively with core Java collection classes such as ArrayList, HashMap and Iterator.
- Involved in database design and responsible for creating and modifying database objects.
- Developed stored procedures, functions and SQL statements for performing all transactions in the Oracle database.
- Responsible for production support (including bug fixing and content changes).
- Developed user interfaces for different modules of the product, such as user creation and modification.
- Used a MySQL database to maintain user, module and call data: creating tables, modifying them as per requirements, and creating primary, foreign and unique keys.
- Developed and customized reports per customer requirements for administrator analysis, using PHP-MySQL connections to fetch records from the MySQL database and present them to the client.
- Retrieved data using joins in the required format and presented it in specific formats using PHP scripts.
- Handled projects across different business verticals such as banking, insurance, healthcare and government.
- Used MySQL, PHP 5.3.3, JavaScript and Asterisk in a Linux environment.
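A minimal sketch of the servlet-plus-JDBC access pattern referenced above; the table, columns and connection settings are illustrative placeholders (a production deployment would typically use a pooled DataSource rather than DriverManager).

    import java.io.IOException;
    import java.io.PrintWriter;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class UserLookupServlet extends HttpServlet {

        // Placeholder connection settings.
        private static final String URL = "jdbc:mysql://dbhost:3306/callcenter";
        private static final String USER = "app";
        private static final String PASSWORD = "secret";

        @Override
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            String userId = request.getParameter("userId");
            response.setContentType("text/plain");
            PrintWriter out = response.getWriter();

            // Parameterized query over JDBC to avoid SQL injection.
            String sql = "SELECT user_name FROM users WHERE user_id = ?";
            try (Connection conn = DriverManager.getConnection(URL, USER, PASSWORD);
                 PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setString(1, userId);
                try (ResultSet rs = ps.executeQuery()) {
                    out.println(rs.next() ? rs.getString("user_name") : "not found");
                }
            } catch (SQLException e) {
                throw new ServletException("User lookup failed", e);
            }
        }
    }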
Environment: Java/J2EE, Core Java, Servlets, JSP, JDBC, EJB2.0, PL/SQL, Oracle, Web logic, and Windows 2000.