Big Data/Hadoop Developer Resume
Minneapolis, MN
SUMMARY
- Over 7 years of professional IT experience, including work with Big Data ecosystem technologies.
- More than 3 years of experience in Big Data technologies.
- In-depth understanding of Hadoop architecture and its components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and MapReduce.
- Experience using the Hortonworks and Cloudera Hadoop distributions and their components, including MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Pig, Hue, ZooKeeper and Flume.
- Experience in reviewing Hadoop log files to detect node failures.
- Experience in analyzing data using HiveQL, Pig Latin, HBase and custom MapReduce programs in Java.
- Worked with multiple input formats such as TextInputFormat, KeyValueTextInputFormat, SequenceFileInputFormat and NLineInputFormat.
- Experience working with multiple file formats, including JSON, XML, SequenceFile and RCFile.
- Experience importing and exporting data between HDFS and relational database systems using Sqoop.
- Experience using Talend Integration Suite (5.0/5.5/6.1)/Talend Open Studio (5.0/5.5/6.1).
- Extended Hive and Pig core functionality by writing custom UDFs.
- Experience in scheduling recurring Hadoop jobs using Apache Oozie workflows.
- Very good understanding of NoSQL databases such as MongoDB, HBase and Cassandra.
- Worked on real-time, in-memory processing engines such as Spark and Impala, and on their integration with BI tools such as Tableau.
- Good knowledge of creating event-processing data pipelines using Kafka and Spark Streaming.
- Experienced in loading log data into HDFS by collecting and aggregating the data from various sources using Flume.
- Experience in data management and implementation of Big Data applications using Hadoop frameworks.
- Knowledge of the design and implementation of the Data Warehouse life cycle.
- Knowledge of Data Warehouse/Data Mart design concepts.
- Familiar with data architecture including data ingestion pipeline design, Hadoop information architecture, data modeling and data mining, machine learning and advanced data processing. Experience optimizing ETL workflows.
- Experience in designing, developing and implementing connectivity products that allow efficient exchange of data between our core database engine and the Hadoop ecosystem.
- Experience in various programming languages like C, C++, Java/J2EE, Python, Scala, PL/SQL.
- Expertise in RDBMSs such as Oracle, MS SQL Server, MySQL, Greenplum and DB2.
- Experience in UNIX and shell scripting.
- Experience in developing and applying machine learning algorithms to Big Data.
- Experience in Agile Engineering practices.
- Good knowledge of GitHub and Jenkins for automated deployments.
- Techno-functional responsibilities include interfacing with users, identifying functional and technical gaps, estimates, designing custom solutions, development, leading developers, producing documentation, and production support.
- Excellent interpersonal and communication skills, creative, research-minded, technically competent and result-oriented with problem solving and leadership skills.
TECHNICAL SKILLS
Languages: C, C++, Java/J2EE, Python, Scala, PL/SQL, Bash.
Big Data Technologies: HDFS, MapReduce, Pig, Hive, Sqoop, Flume, Oozie, ZooKeeper, YARN, Spark.
Data Stacks: Apache Spark, Apache Hadoop, Oracle, MySQL, MS SQL Server, Greenplum.
NoSQL Databases: HBase, MongoDB, Cassandra.
Java/J2EE & Web Technologies: JavaScript, JSF, Ajax, JSP, Servlets, Java Beans, JDBC, EJB, JMS, HTML, XML, CSS.
OS: MS-Windows XP/7, Linux, Unix, Mac OS X.
IDEs & Tools: Eclipse, Sublime Text, Notepad++, Visual Studio, PuTTY.
PROFESSIONAL EXPERIENCE
Confidential, Minneapolis, MN
Big Data/Hadoop Developer
Responsibilities:
- Developed Hive queries on clickstream data to analyze Confidential user behavior across various online modules.
- Implemented partitioning and bucketing in Hive and optimized Hive queries.
- Developed Pig UDFs to pre-process the data for analysis.
- Extensively used Pig for data cleansing.
- Developed MapReduce programs for refined queries on big data.
- Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms.
- Worked with HDFS file formats such as Avro, SequenceFile and text files, and compression codecs such as Snappy and bzip2.
- Loaded data into HDFS and extracted data from Teradata into HDFS using Sqoop.
- Developed Sqoop scripts to import and export data from relational sources and handled incremental loads of the customer data by date.
- Developed Hadoop streaming MapReduce jobs using Spark.
- Used Spark for logistic regression, linear regression and other machine learning algorithms (a minimal Java sketch follows this section).
- Developed Spark SQL scripts to perform analysis on the data from third party vendors.
- Experience in the field of Enterprise Data Warehousing (EDW) and Data Integration.
- Developed a GraphX solution using Spark to inter-relate several users based on their behavior and different IDs.
- Developed a data pipeline using Kafka to store data in HDFS (a producer sketch follows this section).
- Exported data from Kafka topics to HDFS files in a variety of formats and integrated with Hive, using the HDFS connector, to make data immediately available for querying with HiveQL.
- Used Oozie to automate/schedule business workflows which invoke HiveQL, Sqoop, MapReduce and Pig jobs as per the requirements.
- Experienced in building Talend jobs outside of Talend Studio as well as on the TAC server.
- Developed simple to complex MapReduce jobs using SQL.
- Mentored analyst and test teams in writing Hive queries.
- Experience in reviewing Hadoop log files to detect failures.
- Loaded data into the cluster from dynamically generated files using Flume.
- Developed reports on various Hive tables by connecting Tableau Server to Hadoop for data analytics.
Environment: Hortonworks Hadoop, MapReduce, HDFS, Hive, Java, Pig, Linux, HBase, ZooKeeper, Sqoop, Flume, Oozie, Kafka, Talend, Tableau, Spark, Scala, PL/SQL.
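Illustrative Java sketch of the Spark logistic regression work above — a minimal example assuming pre-processed features in LIBSVM format on HDFS; the path and parameters are hypothetical, not the production job.

```java
import org.apache.spark.ml.classification.LogisticRegression;
import org.apache.spark.ml.classification.LogisticRegressionModel;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class LogisticRegressionSketch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("UserBehaviorLogisticRegression")
                .getOrCreate();

        // Hypothetical path: pre-processed features in LIBSVM format on HDFS.
        Dataset<Row> training = spark.read().format("libsvm")
                .load("hdfs:///data/clickstream/features.libsvm");

        // Train a regularized logistic regression model.
        LogisticRegression lr = new LogisticRegression()
                .setMaxIter(10)
                .setRegParam(0.01);
        LogisticRegressionModel model = lr.fit(training);

        System.out.println("Coefficients: " + model.coefficients()
                + " Intercept: " + model.intercept());
        spark.stop();
    }
}
```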
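Illustrative sketch of the producer side of the Kafka-to-HDFS pipeline above; the broker address, topic name and payload are assumptions, and persistence to HDFS is handled downstream by the HDFS connector.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ClickstreamProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical broker address.
        props.put("bootstrap.servers", "broker1:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Each record is a JSON clickstream event; the HDFS sink connector
            // subscribed to this topic writes the events to HDFS for Hive queries.
            producer.send(new ProducerRecord<>("clickstream-events",
                    "user-123", "{\"page\":\"home\",\"action\":\"view\"}"));
        }
    }
}
```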
Confidential, Durham, NC
Hadoop Developer
Responsibilities:
- Handled importing data from various data sources and performed data transformations using HAWQ and MapReduce.
- Involved in creating Hive internal and external tables, loading data and writing Hive queries that run internally as MapReduce jobs.
- Implemented complex MapReduce programs in Java to perform map-side joins using the distributed cache.
- Designed and implemented custom keys, values, partitioners, combiners, InputFormats and RecordReaders in Java.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Implemented UDFs, UDAFs and UDTFs in Java for Hive to process data in ways not possible with Hive's built-in functions (a minimal UDF sketch follows this section).
- Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
- Worked on the complex Hive data types array, map and struct.
- Analyzed the data by running Hive queries and Pig scripts to understand user behavior.
- Analyzed JSON and XML files using Hive built-in functions and SerDes.
- Transformed the log files into structured data using Hive SerDes and Pig loaders.
- Parsed JSON and XML files in Pig using Pig Loader functions and extracted meaningful information from Pig relations by providing a regex to Pig's built-in functions.
- Extensively used Pig for data cleansing.
- Exported the analyzed data to the relational databases using Sqoop and generated reports for the BI team.
- Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive, Pig and Sqoop.
- Deployed and configured Flume agents to stream log events into HDFS for analysis.
- Familiarity with using the NoSQL database HBase on top of HDFS.
- Loaded and transformed large sets of structured and semi-structured data using Hive and Impala.
- Connected Hive and Impala to Tableau reporting tool and generated graphical reports.
Environment: Pivotal HD, MapReduce, EDW, HDFS, Hive, Java, Pig, Linux, XML, JSON, HBase, ZooKeeper, Sqoop, Flume, Oozie, Impala, Tableau, MySQL, PuTTY.
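Illustrative sketch of a Hive UDF of the kind described above; the class name and normalization logic are hypothetical, not a specific production UDF.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Registered in Hive with: ADD JAR ...; CREATE TEMPORARY FUNCTION normalize_id AS '...';
public class NormalizeIdUDF extends UDF {
    private final Text result = new Text();

    // Trims whitespace and lower-cases an identifier; returns null for null input.
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        result.set(input.toString().trim().toLowerCase());
        return result;
    }
}
```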
Confidential, Patskala, OH
Hadoop Developer
Responsibilities:
- Developed multiple MapReduce jobs in Java for data cleaning and pre-processing (a cleaning-mapper sketch follows this section).
- Developed efficient MapReduce programs for filtering out the unstructured data.
- Experience loading and transforming large sets of structured, semi-structured and unstructured data.
- Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
- Developed unit test cases for mapper, reducer and driver classes.
- Developed Hive queries for data sampling and analysis for the analysts.
- Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
- Involved in developing Pig scripts.
- Used Pig as an ETL tool to perform transformations, joins and some pre-aggregations before storing the data in HDFS.
- Experience in migrating the data warehouse from Oracle to Teradata.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Involved in moving all log files generated from various sources into Hadoop HDFS using Flume for further processing.
- Good knowledge of analyzing data in HBase using Hive and Pig. Experienced in defining job flows using Oozie.
- Used Agile/Scrum method for requirements gathering.
- Developed Java MapReduce programs using Mahout to apply to different datasets.
- Extensive use of Maven to build JAR files for MapReduce programs and deploy them to the cluster.
- Identified several PL/SQL batch applications in General Ledger processing and conducted performance comparison to demonstrate the benefits of migrating to Hadoop.
- Experienced in managing and reviewing Hadoop log files.
- Configured Sentry to secure access to purchase information stored in Hadoop.
- Involved in several POCs for different LOBs to benchmark the performance of data-mining using Hadoop.
Environment: Cloudera Hadoop, MS SQL Server, Oracle, Hadoop CDH 3/4/5, Pig, Hive, ZooKeeper, Mahout, HDFS, HBase, Sqoop, Java, Oozie, Hue, Tez, UNIX Shell Scripting, PL/SQL, Maven, Ant.
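Illustrative sketch of the data-cleaning MapReduce pattern above — a map-only step that drops malformed records; the delimiter and expected field count are assumptions.

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Map-only cleaning step: drops malformed records and emits the rest unchanged.
public class CleanRecordsMapper
        extends Mapper<LongWritable, Text, Text, NullWritable> {

    private static final int EXPECTED_FIELDS = 12; // assumed record width

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split(",", -1);
        // Keep only records with the expected number of fields and a non-empty key field.
        if (fields.length == EXPECTED_FIELDS && !fields[0].isEmpty()) {
            context.write(value, NullWritable.get());
        } else {
            context.getCounter("CLEANING", "MALFORMED_RECORDS").increment(1);
        }
    }
}
```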
Confidential, Raleigh, NC
Application Developer J2EE
Responsibilities:
- Developed JavaScript behavior code for user interaction.
- Created a database program in SQL Server to manipulate data accumulated by internet transactions.
- Wrote servlet classes to generate dynamic HTML pages (a minimal servlet sketch follows this section).
- Developed servlets and back-end Java classes using WebSphere Application Server.
- Developed an API to write XML documents from a database.
- Performed usability testing for the application using JUnit Test.
- Maintained a Java GUI application using JFC/Swing.
- Created complex SQL and used JDBC connectivity to access the database.
- Involved in the design and coding of the data capture templates, presentation and component templates.
- Part of the team that designed, customized and implemented metadata search and database synchronization.
- Used Oracle as the database and Toad for query execution; involved in writing SQL scripts and PL/SQL code for procedures and functions.
Environment: Java, WebSphere 3.5, EJB, Servlets, JavaScript, JDBC, SQL, JUnit, Eclipse IDE, Apache Tomcat 6.
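Illustrative sketch of a servlet generating a dynamic HTML page, as referenced above; the servlet name and request parameter are hypothetical.

```java
import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Minimal servlet of the kind used to render dynamic HTML pages.
public class GreetingServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        String user = request.getParameter("user"); // hypothetical query parameter
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        out.println("<html><body>");
        out.println("<h1>Welcome, " + (user != null ? user : "guest") + "</h1>");
        out.println("</body></html>");
    }
}
```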
Confidential
JAVA Developer
Responsibilities:
- Responsible and active in the analysis, design, implementation and deployment phases across the full Software Development Lifecycle (SDLC) of the project.
- Designed and developed user interface using JSP, HTML and JavaScript.
- Developed Struts action classes, action forms and performed action mapping using Struts framework and performed data validation in form beans and action classes.
- Extensively used Struts framework as the controller to handle subsequent client requests and invoke the model based upon user requests.
- Defined the search criteria and pulled the customer's record from the database, made the required changes and saved the updated record back to the database.
- Validated the fields of user registration screen and login screen by writing JavaScript validations.
- Developed build and deployment scripts using Apache ANT to customize WAR and EAR files.
- Used the DAO pattern and JDBC for database access (a minimal DAO sketch follows this section).
- Developed stored procedures and triggers using PL/SQL in order to calculate and update the tables to implement business logic.
- Designed and developed XML processing components for dynamic menus in the application.
- Involved in postproduction support and maintenance of the application.
Environment: Oracle 11g, Java 1.5, Struts, Servlets, HTML, XML, SQL, J2EE, JUnit, Tomcat 6.
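Illustrative sketch of the DAO/JDBC access pattern referenced above, written in modern Java syntax for brevity; the table, column and connection details are assumptions.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Simple DAO that looks up a customer name by id over plain JDBC.
public class CustomerDao {
    private final String url;      // e.g. a JDBC URL for the Oracle database (assumed)
    private final String user;
    private final String password;

    public CustomerDao(String url, String user, String password) {
        this.url = url;
        this.user = user;
        this.password = password;
    }

    public String findCustomerName(long customerId) throws SQLException {
        String sql = "SELECT name FROM customers WHERE id = ?"; // hypothetical table
        try (Connection conn = DriverManager.getConnection(url, user, password);
             PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setLong(1, customerId);
            try (ResultSet rs = stmt.executeQuery()) {
                return rs.next() ? rs.getString("name") : null;
            }
        }
    }
}
```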