Hadoop Admin/Developer Resume
Columbus, OH
PROFESSIONAL SUMMARY:
- Over 7 years of experience across Hadoop, Java, and ETL, including extensive experience with Big Data technologies and in the development of standalone and web applications in multi-tiered environments using Java, Hadoop, Hive, HBase, and Pig
- Good work experience on large-scale systems development projects, especially enterprise distributed systems.
- Strong knowledge of JCL
- Very good understanding of Hadoop ecosystem components such as Sqoop, Spark, and YARN.
- Strong working experience in rule-based decision making, information parsing, and complex data processing using Schematron and Drools.
- Experience in Data Analysis, Data Validation, Data Verification, Data Cleansing, Data Completeness, and identifying data mismatches.
- Experience working with MapReduce, Pig scripts, and Hive Query Language (HiveQL).
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa
- In-depth understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts
- Extending Hive and Pig core functionality by writing custom UDFs
- Experience in analyzing data using Hive QL, Pig Latin, and custom MapReduce programs in Java.
- Extensive experience with SQL, PL/SQL, PostgreSQL and database concepts
- Knowledge of NoSQL databases such as MongoDB, HBase, and Cassandra
- Knowledge of job workflow scheduling and monitoring tools like Oozie and Zookeeper
- Experience in Amazon AWS cloud services (EC2, EBS, S3, SQS)
- Utilized Storm for processing large volumes of data.
- Exposure to administrative tasks such as installing Hadoop and ecosystem components like Hive and Pig
- Monitoring and support through Nagios and Ganglia.
- Experience in cluster automation using Shell Scripting
- Handled several techno-functional responsibilities including estimates, identifying functional and technical gaps, requirements gathering, designing solutions, development, developing documentation, and production support
- An individual with excellent interpersonal and communication skills, strong business acumen, creative problem solving skills, technical competency, team-player spirit, and leadership skills
- Hands-on experience in Scala, Kafka, and Storm.
- Experience in implementing Spark using Scala and Spark SQL for faster analysis and processing of data (an illustrative sketch follows this list).
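As a minimal, hypothetical sketch of the Spark SQL work summarized above (shown in Java rather than the Scala used on the projects; the path, view name, and columns are placeholder assumptions):

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SparkSqlSketch {
    public static void main(String[] args) {
        // Spark session; the master is normally supplied by spark-submit.
        SparkSession spark = SparkSession.builder()
                .appName("spark-sql-sketch")
                .getOrCreate();

        // Load a JSON dataset from HDFS (placeholder path).
        Dataset<Row> events = spark.read().json("hdfs:///data/events");

        // Register a temporary view and aggregate with Spark SQL.
        events.createOrReplaceTempView("events");
        Dataset<Row> counts = spark.sql(
                "SELECT user_id, COUNT(*) AS event_count FROM events GROUP BY user_id");

        counts.show();
        spark.stop();
    }
}
```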
TECHNICAL SKILLS:
Hadoop/Big Data: HDFS, MapReduce, Pig, Hive, Impala, HBase, Cassandra, Sqoop, Oozie, Zookeeper, Flume
Java & J2EE Technologies: Core Java
IDE Tools: Eclipse, NetBeans
Programming Languages: COBOL, Java, KSH, and markup languages
Databases: Oracle, MySQL, DB2, IMS, PostgreSQL
Operating Systems: Windows 95/98/2000/XP/Vista/7, Unix
Reporting Tools: Tableau
Other Tools: PuTTY, WinSCP, EDI (Gentran), Streamweaver, Compuset
WORK EXPERIENCE:
Confidential, Columbus, OH
Hadoop Admin/Developer
Responsibilities:
- Worked on Hadoop cluster scaling from 4 nodes in development environment to 8 nodes in pre-production stage and up to 24 nodes in production.
- Involved in complete Implementation lifecycle, specialized in writing custom MapReduce, Pig and Hive programs.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Extensively used HiveQL queries to search for particular strings in Hive tables stored in HDFS.
- Possess good Linux and Hadoop system administration skills, networking, shell scripting, and familiarity with open-source configuration management and deployment tools.
- Experience in developing custom UDFs in Java to extend Hive and Pig Latin functionality (an illustrative UDF sketch follows this list).
- Created HBase tables to store data in various formats coming from different sources.
- Used Maven to build and deploy code to the YARN cluster.
- Good knowledge of building Apache Spark applications using Scala.
- Developed several business services as Java RESTful web services using the Spring MVC framework.
- Managed and scheduled jobs to remove duplicate log data files in HDFS using Oozie.
- Used Apache Oozie for scheduling and managing the Hadoop Jobs. Knowledge on HCatalog for Hadoop based storage management.
- Expert in designing and creating data ingest pipelines using technologies such as Spring Integration and Apache Storm with Kafka.
- Used Flume extensively in gathering and moving log data files from Application Servers to a central location in Hadoop Distributed File System (HDFS).
- Implemented test scripts to support test driven development and continuous integration.
- Moved data from HDFS to a MySQL database and vice versa using Sqoop.
- Responsible for managing data coming from different sources.
- Experienced in analyzing the Cassandra database and comparing it with other open-source NoSQL databases to determine which better suits the current requirements.
- Used File System check (FSCK) to check the health of files in HDFS.
- Developed the UNIX shell scripts for creating the reports from Hive data.
- Experienced in loading and transforming large sets of structured, semi-structured, and unstructured data.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Used Java and J2EE application development skills with Object-Oriented Analysis and was extensively involved throughout the Software Development Life Cycle (SDLC).
- Extensively used Sqoop to get data from RDBMS sources like Teradata and Netezza.
- Created a complete processing engine based on Cloudera's distribution.
- Involved in collecting metrics for Hadoop clusters using Ganglia and Ambari.
- Extracted files from CouchDB and MongoDB through Sqoop and placed them in HDFS for processing.
- Used Spark Streaming to collect data from Kafka in near real time and perform the necessary transformations and aggregations on the fly to build the common learner data model, persisting the data in a NoSQL store (HBase).
- Configured Kerberos for the clusters
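A minimal sketch of the kind of custom Hive UDF referenced above (illustrative only; the class name and behavior are assumptions rather than the actual production UDFs):

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/** Illustrative Hive UDF: trims whitespace and lower-cases a string column. */
public class NormalizeString extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;            // Hive passes NULLs through unchanged.
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```

Such a UDF is packaged into a JAR, registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION, and then called like any built-in function inside HiveQL queries.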
Environment: Java, UNIX, HDFS, Pig, Python, Hive, MapReduce, Sqoop, Spring, NoSQL DBs, Cassandra, HBase, AWS, Linux, Chef, Flume, Hortonworks, Maven, Oozie, Spark, YARN, Shell scripting, JCL
Confidential, Piscataway NJ
Hadoop Developer
Responsibilities:
- Involved in complete Implementation lifecycle, writing custom MapReduce, Pig and Hive programs.
- Exported the analyzed data to the RDBMS using Sqoop for visualization and to generate reports for the BI team.
- Used HiveQL queries to search for particular strings in Hive tables stored in HDFS.
- Performed various performance optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins (a map-side join sketch follows this list).
- Installed and configured Hive and wrote Hive UDFs.
- Managed HBase tables to store data in various formats.
- Implemented technical solutions for POCs, writing code using technologies such as YARN, Python, and Microsoft SQL Server.
- Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
- Managed Amazon Web Services (AWS) EC2 with Puppet.
- Utilized Python regular expression operations (NLP) to analyze customer reviews.
- Used Flume to collect, aggregate, and store web log data from sources such as web servers, mobile, and network devices, and pushed it to HDFS.
- Monitoring Hadoop cluster using tools like Nagios, Ganglia, Ambari and Cloudera Manager
- Experienced in developing Hadoop integrations for data ingestion, data mapping, and data processing capabilities.
- Managed data coming from different portfolios.
- Experienced in analyzing the Cassandra database and comparing it with other open-source NoSQL databases to determine which better suits the current requirements.
- Expertise in AWS data migration between different database platforms, such as SQL Server to Amazon Aurora, using the RDS tool.
- Created and consumed RESTful web services with Node.js and MongoDB.
- Supported tuple processing and writing data with Storm by providing Storm-Kafka connectors.
- Utilized Java and MySQL day to day to debug and fix issues with client processes.
- Monitored system health and logs and responded accordingly to any warning or failure conditions.
- Developed scripts to automate routine DBA tasks using Linux/UNIX shell scripts (e.g., database refreshes, backups, monitoring).
- Streamed data in real time using Spark with Kafka and stored the stream data to HDFS using Scala.
- Analyzed large datasets to determine the optimal way to aggregate and report on them.
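A minimal sketch of the map-side join optimization mentioned above, using the Hadoop distributed cache (hypothetical paths, field layout, and class names; not the actual project code):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

/** Illustrative map-side join: a small lookup table is shipped via the distributed cache. */
public class MapSideJoinMapper extends Mapper<LongWritable, Text, Text, Text> {

    private final Map<String, String> lookup = new HashMap<String, String>();

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // The driver registers the small dataset with, e.g.:
        // job.addCacheFile(new URI("hdfs:///data/lookup/small_table.csv"));  // placeholder path
        URI[] cacheFiles = context.getCacheFiles();
        FileSystem fs = FileSystem.get(context.getConfiguration());
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(fs.open(new Path(cacheFiles[0].getPath()))));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split(",", 2);   // key,value per line
                if (parts.length == 2) {
                    lookup.put(parts[0], parts[1]);
                }
            }
        } finally {
            reader.close();
        }
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Join each large-table record against the in-memory lookup; no reduce phase is needed.
        String[] fields = value.toString().split(",", 2);
        String joined = lookup.get(fields[0]);
        if (fields.length == 2 && joined != null) {
            context.write(new Text(fields[0]), new Text(fields[1] + "," + joined));
        }
    }
}
```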
Environment: Hadoop (Hortonworks), Java, Python, UNIX, HDFS, Chef, Pig, Hive, MapReduce, Sqoop, NoSQL DBs, Cassandra, HBase, Maven, Linux, Flume, Oozie
Confidential, Schaumburg, IL
Hadoop Admin andDeveloper
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleansing and pre-processing.
- Imported and exported data into HDFS and Hive using Sqoop.
- Proactively monitored systems and services; handled architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
- Developed Spark jobs using Scala in the test environment for faster data processing and used Spark SQL for querying.
- Worked on NoSQL databases including MongoDB, Cassandra, and HBase.
- Used Flume to collect, aggregate, and store web log data from sources such as web servers, mobile, and network devices, and pushed it to HDFS.
- Used Storm for real-time processing of data.
- Developed Puppet scripts to install Hive, Sqoop, etc. on the nodes
- Performed data backup and synchronization using Amazon Web Services.
- Worked on Amazon Web Services as the primary cloud platform
- Supported MapReduce programs running on the cluster.
- Loaded log data into HDFS using Flume and Kafka and performed ETL integrations.
- Designed and implemented DR and OR procedures
- Used Spring Framework with Hibernate to map to Oracle database
- Wrote shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions
- Involved in configuring Hive and writing Hive UDFs
- Worked on DevOps tools like Chef, Ansible, and Jenkins to configure and maintain the production environment.
- Worked on installing and configuring EC2 instances on Amazon Web Services (AWS) for establishing clusters on cloud
- Hands-on experience with Sun ONE Application Server, WebLogic Application Server, WebSphere Application Server, WebSphere Portal Server, and J2EE application deployment technology.
- Wrote automation scripts to monitor HDFS and HBase through cron jobs.
- Used MRUnit for debugging MapReduce jobs that use sequence files containing key-value pairs (an MRUnit test sketch follows this list).
- Developed a high-performance cache, making the site stable and improving its performance.
- Proficient with SQL and have a good understanding of Informatica.
- Provided administrative support for parallel computation research on a 24-node Fedora Linux cluster.
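A minimal sketch of the MRUnit testing mentioned above (the mapper under test, WordCountMapper, is a hypothetical placeholder, as are the inputs and expected outputs):

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Before;
import org.junit.Test;

/** Illustrative MRUnit test for a hypothetical mapper that emits (token, 1) pairs. */
public class WordCountMapperTest {

    private MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;

    @Before
    public void setUp() {
        // WordCountMapper is a placeholder name for the mapper under test.
        mapDriver = MapDriver.newMapDriver(new WordCountMapper());
    }

    @Test
    public void emitsOneCountPerToken() throws IOException {
        mapDriver.withInput(new LongWritable(0), new Text("hadoop hive"))
                 .withOutput(new Text("hadoop"), new IntWritable(1))
                 .withOutput(new Text("hive"), new IntWritable(1))
                 .runTest();
    }
}
```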
Environment: Hadoop, MapReduce, AWS EC2, HDFS, Chef, Jenkins, Hive, Spark, Kafka, CouchDB, Flume, Cassandra, Hibernate, Oracle 11g, Java, Struts, Servlets, HTML, XML, SQL, J2EE, MRUnit, Informatica, JUnit, Tomcat 6, JDBC, JNDI, Maven, Eclipse.
Confidential, New York, NY
Hadoop Admin
Responsibilities:
- Good understanding and related experience with Hadoop stack - internals, Hive, Pig and Map/Reduce.
- Wrote MapReduce jobs to discover trends in data usage by users (a minimal sketch follows this list).
- Involved in defining job flows.
- Involved in managing and reviewing Hadoop log files.
- Involved in running Hadoop streaming jobs to process terabytes of text data.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Extensive experience in testing, debugging, and deploying MapReduce jobs on Hadoop platforms.
- Involved in loading data from the UNIX file system to HDFS.
- Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Installed and configured Hive and wrote HiveQL scripts.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Implemented partitioning, dynamic partitions, and buckets in Hive.
- Extensive usage of Struts, HTML, CSS, JSP, jQuery, AJAX, and JavaScript for interactive pages.
- Developed workflows to process Flume log data using Apache Spark in Scala.
- Used Ganglia to monitor the cluster around the clock.
- Assist the team in their development & deployment activities.
- Instrumental in preparing TDD and developing Java web services for WU applications for many of the money transfer functionalities.
- Used web services concepts like SOAP, WSDL, JAXB, and JAXP to interact with other projects within the Supreme Court for sharing information.
- Involved in developing Database access components using Spring DAO integrated with Hibernate for accessing the data.
- Involved in writing HQL queries, Criteria queries and SQL queries for the Data access layer.
- Involved in managing deployments using XML scripts.
- Testing: unit testing through JUnit and integration testing in the staging environment.
- Followed Agile SCRUM principles in developing the project.
- Involved in development of SQL Server Stored Procedures and SSIS DTSX Packages to automate regular mundane tasks as per business needs.
- Coordinated with offshore/onshore teams, collaborating and arranging weekly meetings to discuss and track development progress.
- Involved in coordinating for Unit Testing, Quality Assurance, User Acceptance Testing and Bug Fixing.
- Coordinated with the team on peer reviews and collaborative system-level testing. Worked hands-on with the ETL process, handling imports of data from various sources and performing transformations.
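A minimal sketch of the usage-trend MapReduce work referenced above (the class names and the assumption that the user id is the first tab-separated field are hypothetical):

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

/** Illustrative mapper/reducer pair that counts records per user from delimited log lines. */
public class UsageByUser {

    public static class UsageMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Assume the user id is the first tab-separated field of each log line.
            String[] fields = value.toString().split("\t");
            if (fields.length > 0 && !fields[0].isEmpty()) {
                context.write(new Text(fields[0]), ONE);
            }
        }
    }

    public static class UsageReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text user, Iterable<IntWritable> counts, Context context)
                throws IOException, InterruptedException {
            int total = 0;
            for (IntWritable count : counts) {
                total += count.get();
            }
            context.write(user, new IntWritable(total));
        }
    }
}
```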
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Java (JDK 1.6), HTML, JavaScript, XML, XSLT, jQuery, AJAX, Web Services, JNDI, SQL Server, Struts 2.0, Hibernate.
Confidential
Java Developer
Responsibilities:
- Involved in the elaboration, construction and transition phases of the Rational Unified Process.
- Designed and developed necessary UML Diagrams like Use Case, Class, Sequence, State and Activity diagrams using IBM Rational Rose.
- Used IBM Rational Application Developer (RAD) for development.
- Extensively applied various design patterns such as MVC-2, Front Controller, Factory, Singleton, Business Delegate, Session Façade, Service Locator, DAO etc. throughout the application for a clear and manageable distribution of roles.
- Implemented the project as a multi-tier application using Jakarta Struts Framework along with JSP for the presentation tier.
- Used the Struts Validation Framework for validation and Struts Tiles Framework for reusable presentation components at the presentation tier.
- Developed various Action classes that route requests to the appropriate handlers (an illustrative Action class sketch follows this list).
- Developed Session Beans to process user requests and Entity Beans to load and store information from database.
- Used JMS (MQSeries) for reliable and asynchronous messaging between the different components.
- Worked extensively with Node.js, Angular.js, etc.
- Wrote stored procedures and complex queries for IBM DB2.
- Designed and used JUnit test cases during the development phase.
- Extensively used log4j for logging throughout the application.
- Used CVS for efficiently managing the source code versions with the development team.
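A minimal sketch of the kind of Struts Action class described above (the class name, request parameter, and forward names are hypothetical; real mappings would live in struts-config.xml):

```java
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.struts.action.Action;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;

/** Illustrative Struts Action that routes a request to a success or failure view. */
public class LookupCustomerAction extends Action {

    @Override
    public ActionForward execute(ActionMapping mapping, ActionForm form,
                                 HttpServletRequest request, HttpServletResponse response)
            throws Exception {
        // Validate the incoming parameter and pick the forward declared in struts-config.xml.
        String customerId = request.getParameter("customerId");
        if (customerId == null || customerId.trim().isEmpty()) {
            return mapping.findForward("failure");
        }
        request.setAttribute("customerId", customerId.trim());
        return mapping.findForward("success");
    }
}
```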
Environment: JDK, J2EE, Web Services (SOAP, WSDL, JAX-WS), Hibernate, Spring, Servlets, JSP, JavaBeans, NetBeans, Oracle SQL Developer, JUnit, Clover, CVS, Log4j, PL/SQL, Oracle, WebSphere Application Server, Tomcat Web Server, Node.js
Confidential
JAVA Developer
Responsibilities:
- Involved in Design, Development and Support phases of Software Development Life Cycle (SDLC)
- Reviewed the functional, design, source code and test specifications
- Involved in complete front-end development using JavaScript and CSS.
- Created real-time web applications using Node.js.
- Authored functional, design, and test specifications.
- Implemented Backend, Configuration DAO, XML generation modules of DIS
- Analyzed, designed and developed the component
- Used JDBC for database access (a minimal JDBC sketch follows this list).
- Used Data Transfer Object (DTO) design patterns
- Unit testing and rigorous integration testing of the whole application
- Wrote and executed test scripts using JUnit.
- Actively involved in system testing
- Developed XML parsing tool for regression testing
- Prepared the Installation, Customer guide and Configuration document which were delivered to the customer along with the product.
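A minimal sketch of the JDBC access pattern noted above (connection URL, credentials, table, and columns are placeholder assumptions; written in the pre-Java-7 try/finally style matching the JDK 1.5 environment):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Illustrative JDBC lookup using a parameterized query. */
public class OrderDao {

    public String findStatus(long orderId) throws SQLException {
        // Placeholder connection details; real code would use a pooled DataSource.
        Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@localhost:1521:XE", "app_user", "app_password");
        try {
            PreparedStatement stmt = conn.prepareStatement(
                    "SELECT status FROM orders WHERE order_id = ?");
            try {
                stmt.setLong(1, orderId);
                ResultSet rs = stmt.executeQuery();
                return rs.next() ? rs.getString("status") : null;
            } finally {
                stmt.close();
            }
        } finally {
            conn.close();
        }
    }
}
```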
Environment: Java, JavaScript, HTML, CSS, JDK 1.5.1, JDBC, JUnit, Node.js, Oracle 10g, XML, XSL, and UML