Hadoop Admin/Dev Resume
PROFESSIONAL SUMMARY:
- US citizen with over 10 years of IT experience across the Finance, Banking, and Insurance domains, working with a wide range of technologies.
- Experience includes Big Data Hadoop with multiple distributions such as Cloudera, Hortonworks, and Apache, on platforms including VMware and cloud environments.
- Excellent understanding of Hadoop architecture and its underlying framework, including storage management in private and public clouds.
- Knowledge of multiple distributions/platforms (Apache, Cloudera, Hortonworks).
- Experienced in using various Hadoop ecosystem components such as MapReduce, HBase, Pig, Hive, MongoDB, Sqoop, and Flume.
- Worked with PySpark and Scala Spark for in-memory computing (a brief sketch follows this list).
- Knowledge of workflow scheduling and coordination tools such as Oozie and ZooKeeper.
- Expert in using Sqoop to fetch data from different RDBMS databases for analysis in HDFS.
- Developed MapReduce code per business requirements.
- Very good understanding of RDBMS, Informatica ETL, and data center technologies.
- Experienced in Java, Python, Ruby, XML, Scala, SQL, PL/SQL, and shell scripting.
- Configured Hadoop clusters in private and public clouds.
- Experienced with virtualization technologies; installed, configured, and administered VMware.
- Ability to learn and adapt quickly to emerging technology paradigms.
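A minimal PySpark sketch of the in-memory computing pattern mentioned above; it is illustrative only, and the path and column names (/data/raw/sales, region, amount) are hypothetical, not details from an actual engagement.

```python
# Minimal PySpark sketch: read raw data from HDFS, cache it in memory,
# and aggregate against the cached DataFrame.
# Path and column names (/data/raw/sales, region, amount) are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("in-memory-aggregation").getOrCreate()

raw = spark.read.option("header", True).csv("/data/raw/sales")
raw.cache()  # keep the DataFrame in memory so later actions avoid re-reading HDFS

totals = (raw.groupBy("region")
             .agg(F.sum(F.col("amount").cast("double")).alias("total_amount")))
totals.show()

spark.stop()
```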
TECHNICAL SKILLS:
Operating Systems: Windows Server 2012/2008/2005, UNIX/Linux, IBM Mainframe
Ecosystem: Hive, HBase, Pig, MongoDB, ZooKeeper, Oozie, Kafka, Sqoop, Flume, and Apache Spark
Methodologies: Agile/Scrum, Light Agile Development (LAD), Waterfall, Iterative
Database / DB Tools: Oracle, Microsoft SQL Server, MySQL, MongoDB
Languages: JavaScript, Python, Scala, Java, Ruby on Rails, SQL, XML
Cloud: AWS, VMware vCenter, private cloud, public cloud, hybrid cloud
Network Protocols: NFS, NTP, DNS, TCP/IP
Security: Active Directory, Kerberos, LDAP
Backup / Monitoring: Veeam, Splunk
PROFESSIONAL EXPERIENCE:
Hadoop Admin/Dev
Confidential
Responsibilities:
- Worked on setting up high availability for the major production cluster and designed automatic failover for the Cloudera cluster.
- Developed MapReduce programs to parse the raw data, populate staging tables, and store the refined data in partitioned tables in the EDW (an illustrative sketch follows this job entry).
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
- Developed MapReduce-style code for Hortonworks Spark in Python and Scala.
- Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and Pig to pre-process the data.
- Prepared technical documentation of systems, processes and application logic for existing data sets.
- Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
- Managed and reviewed Cloudera log files.
- Worked on YARN (MapReduce 2.0) in a cluster environment for interactive querying and parallel batch processing.
- Tested raw data and executed performance scripts.
- Analyzed and transformed data with Hive and Pig.
- Provided assistance for troubleshooting and resolution of problems relating to Hadoop jobs and custom applications.
- Assisted in the design, development, and architecture of the ecosystem and domain.
- Participated in installation, updating and maintenance of Cloudera software applications.
- Configured and maintained a multi-node cluster environment.
- Created and cloned Linux virtual machines and templates using the VMware Virtual Client.
- Added SAN storage using multipath and created physical volumes, volume groups, and logical volumes.
- Performed RPM and YUM package installations, patching, and other server management.
- Automated processes for applying new technologies as requirements changed.
- Configured access domains, required for Device Group and Template administrators.
- Configured Admin Role profiles, required when assigning a custom role to an administrator.
Environment: Cloudera Manager, MapReduce, HDFS, Hive, HBase, MongoDB, Java, Oracle, Pig, Sqoop, Oozie, Tableau, Apache Spark, Kafka, SparkR, MLlib, VMware vCenter, ESXi server.
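The parse / stage / partitioned-load pattern described in the bullets above can be illustrated with a small PySpark sketch; the table names (edw.stg_events, edw.events), the pipe-delimited record layout, and the event_date partition column are assumptions for illustration, not details from the project.

```python
# Illustrative PySpark sketch of the parse / stage / partitioned-load pattern
# described above. Table names (edw.stg_events, edw.events), the pipe-delimited
# layout, and the event_date partition column are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("edw-load").enableHiveSupport().getOrCreate()

raw = spark.read.text("/data/raw/events")  # one raw record per line

def parse(line):
    # assumed layout: event_id|event_date|payload
    event_id, event_date, payload = line.split("|", 2)
    return (event_id, event_date, payload)

parsed = (raw.rdd
             .map(lambda row: parse(row.value))
             .toDF(["event_id", "event_date", "payload"]))

parsed.write.mode("overwrite").saveAsTable("edw.stg_events")   # staging table
(parsed.write.mode("append")
       .partitionBy("event_date")
       .saveAsTable("edw.events"))                              # partitioned target table

spark.stop()
```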
Hadoop Admin/Dev
Confidential
Responsibilities:
- Installed and configured various components of the Hadoop ecosystem and maintained their integrity.
- Managed Hadoop clusters: setup, installation, monitoring, and maintenance.
- Planned production cluster hardware and software installation and coordinated with multiple teams to complete it.
- Designed, configured and managed the backup and disaster recovery for HDFS data.
- Commissioned DataNodes as data grew and decommissioned them when hardware degraded.
- Migrated data across clusters using DistCp.
- Created shell scripts for detecting and alerting on system problems.
- Monitored multiple Hadoop cluster environments using Ganglia and Nagios; monitored workload, job performance, and capacity planning using Ambari.
- Performed data analytics in Hive and exported the resulting metrics back to an Oracle database using Sqoop.
- Designed workflows by scheduling Hive processes for log file data streamed into HDFS using Flume and Kafka.
- Conducted root cause analysis and resolved production problems and data issues.
- Installed and configured Hive, Pig, Sqoop and Oozie on the HDP 2.2.0 cluster.
- Implemented high availability and automatic failover infrastructure to overcome the single point of failure for the NameNode, using ZooKeeper services.
- Implemented the HDFS snapshot feature (see the sketch following this job entry).
- Performed a major upgrade of the production environment from HDP 1.3 to HDP 2.2.0.
- Worked with big data developers, designers, and scientists to troubleshoot MapReduce job failures and issues with Hive, Pig, and Flume.
- Configured custom interceptors in Flume agents for replicating and multiplexing data into multiple sinks.
- Administered Tableau Server, backing up reports and granting privileges to users.
- Worked on Tableau for generating reports on HDFS data.
- Installed Ambari on existing Hadoop cluster.
Environment: Hadoop, MapReduce, HDFS, Hive, HBase, MongoDB, Java, Oracle, Cloudera Manager, Pig, Sqoop, Oozie, Tableau.
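The snapshot and cross-cluster copy steps above use the standard hdfs and hadoop command-line tools; the sketch below simply wraps those commands in Python. The directory and NameNode addresses (/data/edw, nn1.example.com, nn2.example.com) are hypothetical.

```python
# Python wrapper around the standard HDFS/Hadoop CLI, illustrating the snapshot
# and DistCp steps described above. The directory and NameNode addresses
# (/data/edw, nn1.example.com, nn2.example.com) are hypothetical.
import subprocess
from datetime import date

def run(cmd):
    print("running:", " ".join(cmd))
    subprocess.run(cmd, check=True)

src_dir = "/data/edw"

# Allow snapshots on the source directory and create a dated snapshot.
run(["hdfs", "dfsadmin", "-allowSnapshot", src_dir])
run(["hdfs", "dfs", "-createSnapshot", src_dir, "edw-" + date.today().isoformat()])

# Copy the directory to the target cluster with DistCp, preserving file attributes.
run(["hadoop", "distcp", "-p",
     "hdfs://nn1.example.com:8020" + src_dir,
     "hdfs://nn2.example.com:8020" + src_dir])
```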
ETL Developer
Confidential, Falls Church, VA
Responsibilities:
- Modified and added edits in Informatica mappings, following the SDLC methodology.
- Created pseudocode based on client requirements, prepared the decision matrix, created test data, developed code, and performed testing; changes that passed testing were implemented in production.
- Prepared change code documents, technical peer review documents, functional demonstration documents, and all project-related deliverables.
- Created Pig scripts and retrofitted them into production.
- Validated transmission files and split out the valid files using AIX shell scripting and Informatica mappings (an illustrative sketch follows this job entry).
- Created and modified UNIX shell scripts and crontab entries to maintain automation and execute specific processes in the cycle run.
- Created mapping parameters to pass values into mappings and meet frequently changing business requirements.
- Created a reusable batch file to catalog the DB2 database.
- Loaded test data into the TED STG database and ran Informatica mappings to process the received records.
- Wrote SQL queries to verify test result data in the TED ODS database.
- Prepared test cases and test data and executed the test cases during unit, system, and user acceptance testing.
Environment: Informatica Power Center 9.5, IBM DB2, MS-Visio, Windows XP/2008, AIX 5.3, Business Objects, ERWin, Advanced Query Tool, Toad, Quest Central.
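The validate-and-split step in the ETL role above was implemented with AIX shell scripting and Informatica mappings; the following is only an equivalent sketch of the idea written in Python. The file names, pipe delimiter, and expected field count are hypothetical.

```python
# Illustrative Python sketch of the validate-and-split idea described above
# (the project itself used AIX shell scripting and Informatica mappings).
# File names, delimiter, and expected field count are hypothetical.
import csv

EXPECTED_FIELDS = 12      # assumed record layout
DELIMITER = "|"

with open("transmission.dat", newline="") as src, \
     open("valid.dat", "w", newline="") as good, \
     open("rejects.dat", "w", newline="") as bad:
    reader = csv.reader(src, delimiter=DELIMITER)
    good_writer = csv.writer(good, delimiter=DELIMITER)
    bad_writer = csv.writer(bad, delimiter=DELIMITER)
    for record in reader:
        # a record is valid when it has the expected field count and a non-empty key
        if len(record) == EXPECTED_FIELDS and record[0].strip():
            good_writer.writerow(record)
        else:
            bad_writer.writerow(record)
```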