
Hadoop Admin/Dev Resume


PROFESSIONAL SUMMARY:

  • US citizen with over 10 years of IT experience across the Finance, Banking, and Insurance domains, working with a range of technologies.
  • Big Data Hadoop experience with distributions including Cloudera, Hortonworks, and Apache, on platforms such as VMware and private and public clouds.
  • Excellent understanding of Hadoop architecture and its underlying framework, including storage management in private and public clouds.
  • Knowledge of multiple distributions/platforms (Apache, Cloudera, Hortonworks).
  • Experienced with Hadoop ecosystem components such as MapReduce, HBase, Pig, Hive, MongoDB, Sqoop, and Flume.
  • Worked with PySpark and Scala Spark for in-memory computing (see the PySpark sketch after this list).
  • Knowledge of job scheduling and coordination tools like Oozie and ZooKeeper.
  • Expert in using Sqoop to move data from RDBMS databases into HDFS for analysis.
  • Developed MapReduce code per business requirements.
  • Very good understanding of RDBMS, Informatica ETL, and data center technologies.
  • Experienced in Java, Python, Ruby, XML, Scala, SQL, PL/SQL, and shell scripting.
  • Configured Hadoop cluster in private and public cloud.
  • Experienced with virtualization technologies; installed, configured, and administered VMware.
  • Ability to learn and adapt quickly to the emerging new technology paradigms.
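As a minimal illustration of the in-memory computing noted above, here is a hedged PySpark sketch; the HDFS path and the column names (region, amount) are hypothetical stand-ins, not details from any actual project.

    # Minimal PySpark sketch: cache a dataset in memory, then aggregate it twice.
    # The HDFS path and column names are illustrative assumptions.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("in-memory-aggregation").getOrCreate()

    df = spark.read.parquet("hdfs:///data/transactions")  # hypothetical path
    df.cache()  # keep the data in memory across the two aggregations below

    df.groupBy("region").agg(F.sum("amount").alias("total")).show()
    df.groupBy("region").count().show()

    spark.stop()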

TECHNICAL SKILLS:

Operating Systems: Windows Server 2012/2008/2005, UNIX/Linux, IBM Mainframe

Ecosystem: Hive, HBase, Pig, MongoDB, ZooKeeper, Oozie, Kafka, Sqoop, Flume, and Apache Spark

Methodologies: Agile/Scrum, Light Agile Development (LAD), Waterfall, Iterative

Database / DB Tools: Oracle, Microsoft SQL Server, MySQL, MongoDB

Languages: JavaScript, Python, Scala, Java, Ruby on Rails, SQL, XML

Cloud: AWS, VMware vCenter, private cloud, public cloud, hybrid cloud

Network Protocols: NFS, NTP, DNS, TCP/IP

Security: Active Directory, Kerberos, LDAP

Backup / Monitoring: Veeam, Splunk

PROFESSIONAL EXPERIENCE:

Hadoop Admin/Dev

Confidential

Responsibilities:

  • Worked on setting up high availability for a major production cluster and designed automatic failover for the Cloudera cluster.
  • Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
  • Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics (see the HiveQL sketch below).
  • Developed MapReduce and Spark code in Python and Scala on Hortonworks (a Hadoop Streaming sketch follows this list).
  • Enabled speedy reviews and first-mover advantage by using Oozie to automate data loading into the Hadoop Distributed File System and Pig to pre-process the data.
  • Prepared technical documentation of systems, processes and application logic for existing data sets.
  • Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
  • Managed and reviewed Cloudera log files.
  • Worked on YARN (MapReduce 2.0) in a cluster environment for interactive querying and parallel batch processing.
  • Tested raw data and executed performance scripts.
  • Analyzed and transformed data with Hive and Pig.
  • Provided assistance for troubleshooting and resolution of problems relating to Hadoop jobs and custom applications.
  • Assisted in the design, development, and architecture of the ecosystem and domain.
  • Participated in installation, updating and maintenance of Cloudera software applications.
  • Configured and maintained a multi-node cluster environment.
  • Created and cloned Linux virtual machines and templates using the VMware client.
  • Added SAN storage using multipath and created physical volumes, volume groups, and logical volumes.
  • Performed RPM and YUM package installations, patching, and other server management tasks.
  • Automated processes to apply new technologies as changes were released.
  • Configured access domains, which are required for Device Group and Template administrators.
  • Configured Admin Role profiles, which are required when assigning a custom role to an administrator.
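To make the Python MapReduce work above concrete, here is a hedged Hadoop Streaming sketch of a mapper/reducer pair; the tab-delimited record layout and the choice of the third field as the key are assumptions for illustration only.

    #!/usr/bin/env python
    # mapper.py -- Hadoop Streaming mapper: parse raw tab-delimited records
    # and emit (key, 1) pairs. Field positions are hypothetical.
    import sys

    for line in sys.stdin:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip malformed records
        print("%s\t1" % fields[2])

    #!/usr/bin/env python
    # reducer.py -- Hadoop Streaming reducer: sum the counts per key.
    # Streaming delivers input sorted by key, so a running total works.
    import sys

    current_key, count = None, 0
    for line in sys.stdin:
        key, value = line.rstrip("\n").split("\t", 1)
        if key != current_key:
            if current_key is not None:
                print("%s\t%d" % (current_key, count))
            current_key, count = key, 0
        count += int(value)
    if current_key is not None:
        print("%s\t%d" % (current_key, count))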

Environment: Cloudera Manager, MapReduce, HDFS, Hive, HBase, MongoDB, Java, Oracle, Pig, Sqoop, Oozie, Tableau, Apache Spark, Kafka, SparkR, MLlib, VMware vCenter, ESXi server.
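One way the Hive trend-comparison queries mentioned above might look, sketched as HiveQL submitted from Python through the hive CLI; every table and column name here (daily_sales, edw_baseline, and so on) is a hypothetical stand-in, not the actual EDW schema.

    # Hedged sketch of a trend query run via `hive -e`; all identifiers
    # are illustrative assumptions.
    import subprocess

    query = """
    SELECT f.product_id,
           f.units_sold,
           b.avg_units_sold,
           f.units_sold / b.avg_units_sold AS trend_ratio
    FROM   daily_sales f
    JOIN   edw_baseline b ON f.product_id = b.product_id
    WHERE  f.ds = '2015-06-01'
    ORDER BY trend_ratio DESC
    LIMIT  50
    """

    subprocess.run(["hive", "-e", query], check=True)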

Hadoop Admin/Dev

Confidential

Responsibilities:

  • Installed and configured various components of the Hadoop ecosystem and maintained their integrity.
  • Managed Hadoop clusters: setup, install, monitor, maintain.
  • Planned production cluster hardware and software installation and communicated with multiple teams to get it done.
  • Designed, configured and managed the backup and disaster recovery for HDFS data.
  • Commissioned DataNodes when data grew and decommissioned when the hardware degraded.
  • Migrated data across clusters using DistCp.
  • Created shell scripts to detect and alert on system problems (see the monitoring sketch after this list).
  • Monitored multiple Hadoop cluster environments using Ganglia and Nagios. Monitored workload, job performance, and capacity planning using Ambari.
  • Performed data analytics in Hive and exported the metrics back to an Oracle database using Sqoop.
  • Designed workflows by scheduling Hive processes for log file data streamed into HDFS using Flume and Kafka.
  • Conducted root cause analysis and resolved production problems and data issues.
  • Installed and configured Hive, Pig, Sqoop and Oozie on the HDP 2.2.0 cluster.
  • Implemented high availability and automatic failover infrastructure to overcome the NameNode single point of failure, utilizing ZooKeeper services.
  • Implemented HDFS snapshot feature.
  • Performed a major upgrade of the production environment from HDP 1.3 to HDP 2.2.0.
  • Worked with big data developers, designers, and scientists to troubleshoot MapReduce job failures and issues with Hive, Pig, and Flume.
  • Configured custom interceptors in Flume agents for replicating and multiplexing data into multiple sinks.
  • Administered Tableau Server backing up the reports and providing privileges to users.
  • Worked on Tableau for generating reports on HDFS data.
  • Installed Ambari on an existing Hadoop cluster.
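The detection-and-alert scripts mentioned above were shell scripts; the underlying logic might resemble this Python sketch, where the expected DataNode count and the mail recipient are illustrative assumptions.

    # Hedged sketch of a cluster health check: count live DataNodes from
    # `hdfs dfsadmin -report` and send an alert when the count drops.
    import re
    import subprocess

    EXPECTED_DATANODES = 10  # hypothetical cluster size

    report = subprocess.run(
        ["hdfs", "dfsadmin", "-report"],
        capture_output=True, text=True, check=True,
    ).stdout

    match = re.search(r"Live datanodes \((\d+)\)", report)
    live = int(match.group(1)) if match else 0

    if live < EXPECTED_DATANODES:
        msg = "WARNING: only %d of %d DataNodes are live" % (live, EXPECTED_DATANODES)
        # `mail` is one common delivery path; Nagios handlers also work.
        subprocess.run(["mail", "-s", msg, "ops@example.com"], input=msg, text=True)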

Environment: Hadoop, MapReduce, HDFS, Hive, HBase, MongoDB, Java, Oracle, Cloudera Manager, Pig, Sqoop, Oozie, Tableau.

ETL Developer

Confidential, Falls Church, VA

Responsibilities:

  • Modified and added edits in Informatica mappings following the SDLC methodology.
  • Created pseudocode based on client requirements, prepared a decision matrix, created test data, developed code, and performed testing; code that passed testing was implemented in production.
  • Prepared change code documents, technical peer reviews, functional demonstration documents, and all project-related deliverables.
  • Created Pig scripts and retrofitted them in production.
  • Validated transmission files and split valid files using AIX shell scripting and Informatica mappings (logic comparable to the Python sketch after this list).
  • Created and modified UNIX shell scripts and crontab files to maintain automation and execute specific processes in cycle runs.
  • Created mapping parameters to pass values into mappings, meeting frequently changing business requirements.
  • Created a reusable batch file to create a catalog in the DB2 database.
  • Loaded test data into the TED STG database and ran Informatica mappings to process the received records.
  • Wrote SQL queries to check test result data in the TED ODS database.
  • Prepared test cases and test data, and executed the test cases during Unit, System, and User Acceptance Testing.
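The transmission-file validation above was done with AIX shell scripting and Informatica; the logic might run along these lines, sketched in Python with a hypothetical header/detail/trailer record layout rather than the actual file specification.

    # Hedged sketch of transmission-file validation: compare the trailer's
    # record count to the detail records actually seen, and split valid
    # detail records into an output file. The H/D/T layout is an assumption.
    import sys

    def split_valid(in_path, out_path):
        detail_count, trailer_count = 0, None
        with open(in_path) as src, open(out_path, "w") as dst:
            for line in src:
                tag = line[:1]
                if tag == "D":                  # detail record: keep it
                    detail_count += 1
                    dst.write(line)
                elif tag == "T":                # trailer holds expected count
                    trailer_count = int(line[1:10])
        if trailer_count != detail_count:
            sys.exit("Validation failed: trailer says %s, found %d"
                     % (trailer_count, detail_count))

    if __name__ == "__main__":
        split_valid(sys.argv[1], sys.argv[2])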

Environment: Informatica Power Center 9.5, IBM DB2, MS-Visio, Windows XP/2008, AIX 5.3, Business Objects, ERWin, Advanced Query Tool, Toad, Quest Central.
