Hadoop Developer Resume
Bentonville, AR
OBJECTIVE:
- To provide Hadoop development services for Big Data implementations
TECHNICAL SKILLS:
Hadoop: MapReduce, HDFS, Hive, Pig, Sqoop, Flume, Oozie, ZooKeeper
System software: Linux, Windows XP, Windows Server 2003, Windows Server 2008
Network administration: TCP/IP fundamentals, wireless networks, LAN and WAN
Languages: C, Java, Python, SQL, HQL, Pig Latin, UNIX shell scripting
Database: MySQL
PROFESSIONAL EXPERIENCE:
HADOOP DEVELOPER
Confidential, BENTONVILLE, AR
Responsibilities:
- Migrated Oozie workflows to the new version during the upgrade of the Hadoop cluster from CDH3u1 to CDH4.1.2
- Developed Oozie workflows for loading full and partial tables into Hadoop using Sqoop and Hive
- Implemented the new household data fix in the Hadoop tables using data from the Experian and Teradata tables
- Maintained data in the Teradata and Hadoop tables and developed workflows to automatically reconcile row-count differences between them
- Developed shell scripts to report per-user disk usage on the Hadoop clusters and to automate data clean-up (see the sketch after this list)
- Built an Oozie dashboard in the Hue browser so business users could author Oozie workflows directly as needed
- Performed data analytics using Hive (HQL)
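Below is a minimal shell sketch of the kind of per-user disk-usage report mentioned above, not the actual production script; the /user layout, the 30-day clean-up threshold and the mail recipient are assumed placeholders.

```bash
#!/bin/bash
# Minimal sketch: per-user HDFS disk-usage report plus a list of clean-up candidates.
# Paths, threshold and recipient are illustrative placeholders.
set -euo pipefail

REPORT=/tmp/hdfs_usage_$(date +%Y%m%d).txt

{
  echo "HDFS usage by user directory, $(date)"
  # One line per /user/<name> directory: bytes consumed and path
  hadoop fs -du /user
  echo
  echo "Clean-up candidates in /tmp older than 30 days:"
  # ls columns: perms, repl, owner, group, size, date, time, path
  hadoop fs -ls /tmp 2>/dev/null | \
    awk -v cutoff="$(date -d '30 days ago' +%Y-%m-%d)" '$6 != "" && $6 < cutoff {print $8}'
} > "${REPORT}"

# Placeholder recipient; any MTA-backed mail client would do here
mail -s "HDFS disk usage report" hadoop-admins@example.com < "${REPORT}"
```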
HADOOP DEVELOPER
Confidential, BENTONVILLE, AR
Responsibilities:
- Ingested data from the third-party provider Experian
- Coordinated with the Experian team to schedule the monthly CPO feeds
- Developed shell scripts to pull the data over FTP to the virtual server and load it into Hadoop (see the sketch after this list)
- Converted the flat text files into a structured format with the '\001' delimiter using HQL
- Automated the scripts using the Confidential proprietary tool 'Resource Manager' (Crontab)
- Developed Oozie workflows to automate importing data from Teradata into Hadoop using Sqoop
- Coordinated with the QA team on SIT defects
- Created the deployment plan for Release 1 and assisted the RM team with the initial deployments
- Provided technical coordination for UAT meetings
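A minimal shell sketch of how such an FTP-to-Hadoop load and '\001' conversion could be wired together; the FTP host, credentials, staging paths and the cpo_raw/cpo_monthly table layouts are illustrative assumptions, not the actual feed.

```bash
#!/bin/bash
# Minimal sketch: pull the monthly flat files over FTP, stage them in HDFS,
# then restructure them into a \001-delimited Hive table.
# FTP_USER / FTP_PASS are expected in the environment; everything named here is a placeholder.
set -euo pipefail

FEED_DATE=$(date +%Y%m)
LOCAL_DIR=/data/staging/cpo/${FEED_DATE}
HDFS_RAW=/user/etl/cpo/raw/${FEED_DATE}

mkdir -p "${LOCAL_DIR}"

# Pull the monthly feed from the FTP drop (placeholder host; wget handles the FTP glob)
wget --ftp-user="${FTP_USER}" --ftp-password="${FTP_PASS}" \
     -P "${LOCAL_DIR}" "ftp://ftp.example.com/cpo/${FEED_DATE}/*.txt"

# Stage the raw files into HDFS
hadoop fs -mkdir -p "${HDFS_RAW}"
hadoop fs -put "${LOCAL_DIR}"/*.txt "${HDFS_RAW}/"

# Convert the flat files into a \001-delimited table (pipe-delimited input is an assumption)
hive -e "
  CREATE EXTERNAL TABLE IF NOT EXISTS cpo_raw (
    household_id STRING, attribute STRING, value STRING)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
  LOCATION '${HDFS_RAW}';

  CREATE TABLE IF NOT EXISTS cpo_monthly (
    household_id STRING, attribute STRING, value STRING)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001';

  INSERT OVERWRITE TABLE cpo_monthly SELECT * FROM cpo_raw;
"
```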
HADOOP CONSULTANT
Confidential, BOSTON, MA
Responsibilities:
- Confidential is an advertising network company that links advertisers with online advertisement inventory. Large volumes of data from bidders and other networks were streamed daily to the Google Storage platform and from there loaded into the Hadoop cluster, where the data was retained for 60 days for analysis. Gigabytes of data were loaded into the cluster daily. The specific tasks involved were:
- Imported data from the bidder loggers into the Hadoop cluster via Google Storage
- Pre-processed the data and created fact tables using Hive
- Exported the resulting data set to MySQL for further analysis
- Generated reports using Pentaho Report Designer
- Automated the entire pipeline, from pulling data out of Google Storage to loading it into MySQL, using shell scripts (see the sketch after this list)
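A minimal end-to-end sketch of the kind of pipeline described above, assuming a placeholder gs:// bucket, illustrative Hive tables (bidder_logs, fact_impressions) and a placeholder MySQL reporting database; it is not the actual production job.

```bash
#!/bin/bash
# Minimal sketch: pull a day's bidder logs from Google Storage, stage them in HDFS,
# build the Hive fact-table partition, and export the aggregate to MySQL.
# DB_USER / DB_PASS are expected in the environment; all names are placeholders.
set -euo pipefail

DAY=$(date -d 'yesterday' +%Y-%m-%d)
LOCAL_DIR=/data/incoming/${DAY}
HDFS_DIR=/warehouse/bidder_logs/dt=${DAY}

# 1. Pull the day's log files from Google Storage
mkdir -p "${LOCAL_DIR}"
gsutil cp "gs://example-bidder-logs/${DAY}/*" "${LOCAL_DIR}/"

# 2. Load them into HDFS
hadoop fs -mkdir -p "${HDFS_DIR}"
hadoop fs -put "${LOCAL_DIR}"/* "${HDFS_DIR}/"

# 3. Pre-process and populate the fact-table partition (illustrative HQL)
hive -e "
  ALTER TABLE bidder_logs ADD IF NOT EXISTS PARTITION (dt='${DAY}') LOCATION '${HDFS_DIR}';
  INSERT OVERWRITE TABLE fact_impressions PARTITION (dt='${DAY}')
  SELECT advertiser_id, placement_id, COUNT(*) AS impressions, SUM(cost) AS spend
  FROM bidder_logs WHERE dt='${DAY}'
  GROUP BY advertiser_id, placement_id;
"

# 4. Export the aggregated partition to MySQL for reporting
sqoop export \
  --connect jdbc:mysql://reports-db.example.com/analytics \
  --username "${DB_USER}" --password "${DB_PASS}" \
  --table fact_impressions \
  --export-dir "/user/hive/warehouse/fact_impressions/dt=${DAY}" \
  --input-fields-terminated-by '\001'
```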
HADOOP ENGINEER
Confidential, SAN DIEGO, CA
Responsibilities:
- Participated in the implementation of Hadoop POCs for clients; responsibilities included developing MapReduce programs and installing Hadoop clusters
- Automated installation through Puppet across multiple racks and assisted in data loading, querying and data extraction
- Developed MapReduce programs to perform data analysis
- Experienced in Hadoop Development:
- Developing MapReduce programs to perform analysis
- Experienced in analyzing data with Hive and Pig
- Experienced in defining job flows
- Job management using the Fair Scheduler
- Cluster coordination services through ZooKeeper
- Importing and exporting data to and from HDFS and Hive using Sqoop
- Loading log data directly into HDFS using Flume
- Knowledge of ETL technologies, machine learning tools and data mining
- Knowledge of the Java Virtual Machine (JVM) and multithreaded processing
- Experienced in Hadoop Administration:
- Installation, configuration and management of Hadoop clusters using Puppet
- Implemented NameNode replication with DRBD to avoid a single point of failure
- Managing cluster configuration to meet the needs of the analysis, whether I/O bound or CPU bound
- Experienced in managing and reviewing Hadoop log files
- 24x7 remote management through Nagios and Ganglia
- Experienced in supporting data analysis projects using Elastic MapReduce (EMR) on the Amazon Web Services (AWS) cloud
- Exported and imported data to and from S3
- Experienced in defining job flows
- Experienced in analyzing data with Hive, Pig and Hadoop Streaming (see the sketch after this list)
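A minimal sketch of a Hadoop Streaming job submitted to a Fair Scheduler pool with a follow-up copy to S3; the streaming jar path, pool name, mapper/reducer scripts, HDFS paths and S3 bucket are assumptions for illustration.

```bash
#!/bin/bash
# Minimal sketch: run a Hadoop Streaming analysis job in a Fair Scheduler pool,
# then ship the output to S3 for Elastic MapReduce work. All names are placeholders.
set -euo pipefail

hadoop jar /usr/lib/hadoop/contrib/streaming/hadoop-streaming.jar \
  -D mapred.fairscheduler.pool=analytics \
  -input  /data/raw/events \
  -output /data/out/event_counts \
  -mapper mapper.py \
  -reducer reducer.py \
  -file mapper.py \
  -file reducer.py

# Copy the aggregated output to S3 (s3n:// URI; bucket name is a placeholder)
hadoop distcp /data/out/event_counts s3n://example-analytics-bucket/event_counts/
```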
MYSQL DATABASE ADMINISTRATOR
Confidential, San Francisco, CA
Responsibilities:
- Provided system support for MySQL servers as part of 24x7 teams
- Performed Linux administration functions like installation, configuration, upgrading and ongoing management using automated tools
- Set up system monitoring via Nagios (NRPE) and reported on system performance and utilization of the MySQL database systems
- Completed analysis of client, server, and infrastructure performance
- Designed databases and tuned queries for MySQL
- Reviewed queries and optimized indexes (see the sketch after this list)
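A minimal sketch of the query review and index work described above, assuming a placeholder orders table and reporting database; DB_USER and DB_PASS are expected in the environment.

```bash
#!/bin/bash
# Minimal sketch: inspect a slow query's plan, review existing indexes, then add one.
# Host, schema, table and column names are illustrative placeholders.
set -euo pipefail

run_sql() {
  mysql -h reports-db.example.com -u "${DB_USER}" -p"${DB_PASS}" analytics -e "$1"
}

# Execution plan of a frequently reported slow query
run_sql "EXPLAIN SELECT customer_id, SUM(amount)
         FROM orders
         WHERE order_date >= '2012-01-01'
         GROUP BY customer_id;"

# Indexes already present on the table
run_sql "SHOW INDEX FROM orders;"

# If the plan shows a full table scan on order_date, add an index to cover the filter
run_sql "ALTER TABLE orders ADD INDEX idx_orders_order_date (order_date);"
```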
LINUX SYSTEM ADMINISTRATOR
Confidential
Responsibilities:
- Managed the Linux servers at Hosted environment
- Provided monitoring, backup and other systems related tasks
- Installed and maintained the Linux servers
- Installed Linux on multiple servers using Preboot Execution Environment (PXE) boot and the Kickstart method
- Shared data and performed backups through NFS
- Monitored system metrics and logs for problems
- Added, removed and updated user account information, reset passwords, etc. (see the sketch after this list)
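Illustrative examples of the routine account and NFS administration commands referred to above; the user, group and export paths are placeholders, not the actual environment.

```bash
# Create an account in an existing group and force a password reset at first login
useradd -m -g developers -s /bin/bash jdoe
passwd jdoe          # set an initial password (interactive)
chage -d 0 jdoe      # expire it so the user must change it at first login

# Lock an account that is no longer needed
usermod -L olduser

# Export a backup directory over NFS and reload the export table
echo "/srv/backups 10.0.0.0/24(ro,sync,no_subtree_check)" >> /etc/exports
exportfs -ra

# Quick checks of system metrics and logs
df -h
free -m
tail -n 50 /var/log/messages
```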