Hadoop Developer Resume
Bentonville, AR
OBJECTIVE:
- To provide Hadoop development services for Big Data implementations
TECHNICAL SKILLS:
Hadoop: MapReduce, HDFS, Hive, Pig, Sqoop, Flume, Oozie, ZooKeeper
System software: Linux, Windows XP, Windows Server 2003, Windows Server 2008
Network administration: TCP/IP fundamentals, wireless networks, LAN and WAN
Languages: C, Java, Python, SQL, HQL, Pig Latin, UNIX shell scripting
Database: MySQL
PROFESSIONAL EXPERIENCE:
HADOOP DEVELOPER
Confidential, BENTONVILLE, AR
Responsibilities:
- Migrated Oozie workflows to the new version during the upgrade of the Hadoop cluster from CDH3u1 to CDH4.1.2
- Developed Oozie workflows for loading full and partial tables into Hadoop using Sqoop and Hive
- Implemented the new household data fix in the Hadoop tables using data from the Experian and Teradata tables
- Maintained data in the Teradata and Hadoop tables and developed workflows to automatically reconcile row-count differences between them
- Developed shell scripts to report per-user disk usage on the Hadoop clusters and to automate data clean-up (see the sketch after this list)
- Built an Oozie dashboard in the Hue browser so business users could author Oozie workflows directly as needed
- Performed data analytics using Hive (HQL)
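Below is a minimal shell sketch of the kind of per-user disk-usage report mentioned above, not the actual production script; the /user layout, the 30-day clean-up threshold and the mail recipient are assumed placeholders.

```bash
#!/bin/bash
# Minimal sketch: per-user HDFS disk-usage report plus a list of clean-up candidates.
# Paths, threshold and recipient are illustrative placeholders.
set -euo pipefail

REPORT=/tmp/hdfs_usage_$(date +%Y%m%d).txt

{
  echo "HDFS usage by user directory, $(date)"
  # One line per /user/<name> directory: bytes consumed and path
  hadoop fs -du /user
  echo
  echo "Clean-up candidates in /tmp older than 30 days:"
  # ls columns: perms, repl, owner, group, size, date, time, path
  hadoop fs -ls /tmp 2>/dev/null | \
    awk -v cutoff="$(date -d '30 days ago' +%Y-%m-%d)" '$6 != "" && $6 < cutoff {print $8}'
} > "${REPORT}"

# Placeholder recipient; any MTA-backed mail client would do here
mail -s "HDFS disk usage report" hadoop-admins@example.com < "${REPORT}"
```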
HADOOP DEVELOPER
Confidential, BENTONVILLE, AR
Responsibilities:
- Ingested data from the third-party provider Experian
- Coordinated with the Experian team to schedule the monthly CPO feeds
- Developed shell scripts to pull the data over FTP to the virtual server and load it into Hadoop (see the sketch after this list)
- Converted the flat text files into a structured format with the '\001' delimiter using HQL
- Automated the scripts using the Confidential proprietary tool 'Resource Manager' (Crontab)
- Developed Oozie workflows to automate importing data from Teradata into Hadoop using Sqoop
- Coordinated with the QA team on SIT defects
- Created the deployment plan for Release 1 and assisted the RM team with the initial deployments
- Provided technical coordination for UAT meetings
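A minimal shell sketch of how such an FTP-to-Hadoop load and '\001' conversion could be wired together; the FTP host, credentials, staging paths and the cpo_raw/cpo_monthly table layouts are illustrative assumptions, not the actual feed.

```bash
#!/bin/bash
# Minimal sketch: pull the monthly flat files over FTP, stage them in HDFS,
# then restructure them into a \001-delimited Hive table.
# FTP_USER / FTP_PASS are expected in the environment; everything named here is a placeholder.
set -euo pipefail

FEED_DATE=$(date +%Y%m)
LOCAL_DIR=/data/staging/cpo/${FEED_DATE}
HDFS_RAW=/user/etl/cpo/raw/${FEED_DATE}

mkdir -p "${LOCAL_DIR}"

# Pull the monthly feed from the FTP drop (placeholder host; wget handles the FTP glob)
wget --ftp-user="${FTP_USER}" --ftp-password="${FTP_PASS}" \
     -P "${LOCAL_DIR}" "ftp://ftp.example.com/cpo/${FEED_DATE}/*.txt"

# Stage the raw files into HDFS
hadoop fs -mkdir -p "${HDFS_RAW}"
hadoop fs -put "${LOCAL_DIR}"/*.txt "${HDFS_RAW}/"

# Convert the flat files into a \001-delimited table (pipe-delimited input is an assumption)
hive -e "
  CREATE EXTERNAL TABLE IF NOT EXISTS cpo_raw (
    household_id STRING, attribute STRING, value STRING)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
  LOCATION '${HDFS_RAW}';

  CREATE TABLE IF NOT EXISTS cpo_monthly (
    household_id STRING, attribute STRING, value STRING)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001';

  INSERT OVERWRITE TABLE cpo_monthly SELECT * FROM cpo_raw;
"
```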
HADOOP CONSULTANT
Confidential, BOSTON, MA
Responsibilities:
- Confidential is an advertising network company that links advertisers with online advertisement inventory. Large volumes of data from bidders and other networks were streamed daily to the Google Storage platform and from there loaded into the Hadoop cluster, where the data was retained for 60 days for analysis. Gigabytes of data were loaded into the cluster daily. The specific tasks involved were:
- Imported data from the bidder loggers into the Hadoop cluster via Google Storage
- Pre-processed the data and created fact tables using Hive
- Exported the resulting data set to MySQL for further analysis
- Generated reports using Pentaho Report Designer
- Automated the entire pipeline, from pulling data out of Google Storage to loading it into MySQL, using shell scripts (see the sketch after this list)
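A minimal end-to-end sketch of the kind of pipeline described above, assuming a placeholder gs:// bucket, illustrative Hive tables (bidder_logs, fact_impressions) and a placeholder MySQL reporting database; it is not the actual production job.

```bash
#!/bin/bash
# Minimal sketch: pull a day's bidder logs from Google Storage, stage them in HDFS,
# build the Hive fact-table partition, and export the aggregate to MySQL.
# DB_USER / DB_PASS are expected in the environment; all names are placeholders.
set -euo pipefail

DAY=$(date -d 'yesterday' +%Y-%m-%d)
LOCAL_DIR=/data/incoming/${DAY}
HDFS_DIR=/warehouse/bidder_logs/dt=${DAY}

# 1. Pull the day's log files from Google Storage
mkdir -p "${LOCAL_DIR}"
gsutil cp "gs://example-bidder-logs/${DAY}/*" "${LOCAL_DIR}/"

# 2. Load them into HDFS
hadoop fs -mkdir -p "${HDFS_DIR}"
hadoop fs -put "${LOCAL_DIR}"/* "${HDFS_DIR}/"

# 3. Pre-process and populate the fact-table partition (illustrative HQL)
hive -e "
  ALTER TABLE bidder_logs ADD IF NOT EXISTS PARTITION (dt='${DAY}') LOCATION '${HDFS_DIR}';
  INSERT OVERWRITE TABLE fact_impressions PARTITION (dt='${DAY}')
  SELECT advertiser_id, placement_id, COUNT(*) AS impressions, SUM(cost) AS spend
  FROM bidder_logs WHERE dt='${DAY}'
  GROUP BY advertiser_id, placement_id;
"

# 4. Export the aggregated partition to MySQL for reporting
sqoop export \
  --connect jdbc:mysql://reports-db.example.com/analytics \
  --username "${DB_USER}" --password "${DB_PASS}" \
  --table fact_impressions \
  --export-dir "/user/hive/warehouse/fact_impressions/dt=${DAY}" \
  --input-fields-terminated-by '\001'
```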
HADOOP ENGINEER
Confidential, SAN DIEGO, CA
Responsibilities:
- Participated in the implementation of Hadoop POCs for clients; responsibilities included developing MapReduce programs and installing Hadoop clusters
- Automated installation through Puppet across multiple racks and assisted in data loading, querying and data extraction
- Developed MapReduce programs to perform data analysis
- Experienced in Hadoop Development:
- Developing MapReduce programs to perform analysis
- Experienced in analyzing data with Hive and Pig
- Experienced in defining job flows
- Job management using the Fair Scheduler
- Cluster coordination services through ZooKeeper
- Importing and exporting data to and from HDFS and Hive using Sqoop
- Loading log data directly into HDFS using Flume
- Knowledge of ETL technologies, machine learning tools and data mining
- Knowledge of the Java Virtual Machine (JVM) and multithreaded processing
- Experienced in Hadoop Administration:
- Installation, configuration and management of Hadoop clusters using Puppet
- Implemented NameNode replication with DRBD to avoid a single point of failure
- Managing cluster configuration to meet the needs of the analysis, whether I/O bound or CPU bound
- Experienced in managing and reviewing Hadoop log files
- 24x7 remote management through Nagios and Ganglia
- Experienced in supporting data analysis projects using Elastic MapReduce (EMR) on the Amazon Web Services (AWS) cloud
- Exported and imported data to and from S3
- Experienced in defining job flows
- Experienced in analyzing data with Hive, Pig and Hadoop Streaming (see the sketch after this list)
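A minimal sketch of a Hadoop Streaming job submitted to a Fair Scheduler pool with a follow-up copy to S3; the streaming jar path, pool name, mapper/reducer scripts, HDFS paths and S3 bucket are assumptions for illustration.

```bash
#!/bin/bash
# Minimal sketch: run a Hadoop Streaming analysis job in a Fair Scheduler pool,
# then ship the output to S3 for Elastic MapReduce work. All names are placeholders.
set -euo pipefail

hadoop jar /usr/lib/hadoop/contrib/streaming/hadoop-streaming.jar \
  -D mapred.fairscheduler.pool=analytics \
  -input  /data/raw/events \
  -output /data/out/event_counts \
  -mapper mapper.py \
  -reducer reducer.py \
  -file mapper.py \
  -file reducer.py

# Copy the aggregated output to S3 (s3n:// URI; bucket name is a placeholder)
hadoop distcp /data/out/event_counts s3n://example-analytics-bucket/event_counts/
```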
MYSQL DATABASE ADMINISTRATOR
Confidential, San Francisco, CA
Responsibilities:
- Provided system support for MySQL servers as part of 24x7 teams
- Performed Linux administration functions like installation, configuration, upgrading and ongoing management using automated tools
- Set up system monitoring via Nagios (NRPE) and reported on system performance and utilization of the MySQL database systems
- Completed analysis of client, server, and infrastructure performance
- Designed databases and tuned queries for MySQL
- Reviewed queries and optimized indexes (see the sketch after this list)
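A minimal sketch of the query review and index work described above, assuming a placeholder orders table and reporting database; DB_USER and DB_PASS are expected in the environment.

```bash
#!/bin/bash
# Minimal sketch: inspect a slow query's plan, review existing indexes, then add one.
# Host, schema, table and column names are illustrative placeholders.
set -euo pipefail

run_sql() {
  mysql -h reports-db.example.com -u "${DB_USER}" -p"${DB_PASS}" analytics -e "$1"
}

# Execution plan of a frequently reported slow query
run_sql "EXPLAIN SELECT customer_id, SUM(amount)
         FROM orders
         WHERE order_date >= '2012-01-01'
         GROUP BY customer_id;"

# Indexes already present on the table
run_sql "SHOW INDEX FROM orders;"

# If the plan shows a full table scan on order_date, add an index to cover the filter
run_sql "ALTER TABLE orders ADD INDEX idx_orders_order_date (order_date);"
```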
LINUX SYSTEM ADMINISTRATOR
Confidential
Responsibilities:
- Managed the Linux servers at Hosted environment
- Provided monitoring, backup and other systems related tasks
- Installed and maintained the Linux servers
- Installed Linux on multiple servers using Preboot Execution Environment (PXE) boot and the Kickstart method
- Shared data and performed backups through NFS
- Monitored system metrics and logs for problems
- Added, removed and updated user account information, reset passwords, etc. (see the sketch after this list)
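Illustrative examples of the routine account and NFS administration commands referred to above; the user, group and export paths are placeholders, not the actual environment.

```bash
# Create an account in an existing group and force a password reset at first login
useradd -m -g developers -s /bin/bash jdoe
passwd jdoe          # set an initial password (interactive)
chage -d 0 jdoe      # expire it so the user must change it at first login

# Lock an account that is no longer needed
usermod -L olduser

# Export a backup directory over NFS and reload the export table
echo "/srv/backups 10.0.0.0/24(ro,sync,no_subtree_check)" >> /etc/exports
exportfs -ra

# Quick checks of system metrics and logs
df -h
free -m
tail -n 50 /var/log/messages
```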