Sr. Lead Hadoop Infrastructure Architect / Administrator Resume
New York City, NY
SUMMARY:
- 10+ Years of IT Experience in Distributed Systems Architecture & Design.
- Automation Experience with Python & Shell.
- Skilled in Debugging S/W & H/W Problems.
- Proficient in the Linux Operating System.
- Strong in Requirements Gathering, Capacity Planning & Forecast Analysis.
- Experienced in Systems Operations such as Performance Tuning, JVM Analysis & Dump Analysis.
- Mentoring Teams on New Technologies/Tools.
TECHNICAL SKILLS:
Platforms: Big Data Ecosystem, Linux, Scripting Languages, MySQL, Oracle, IBM i, AWS, Microsoft Azure, Isilon, CDH.
Tools: MapReduce, HBase, Spark, Impala, Hive, Pig, Oozie, Storm, Cloudera Manager, Nagios, Ganglia, YARN, Ambari, Kafka, Solr, Docker, Kubernetes, R, Anaconda Python.
Job Functions: Design & Administration of Hadoop Clusters, Proofs of Concept in Big Data Analytics, and Project Management.
PROFESSIONAL EXPERIENCE:
Confidential, New York City, NY
Sr. Lead Hadoop Infrastructure Architect / Administrator
Responsibilities:
- Architecting & Designing New Clusters for Big Data Workloads with the Hadoop Ecosystem.
- Implementing & Administering large Hadoop environments for Regulatory processing needs.
- Performed Several Installations and Upgrades on Hadoop Infrastructure.
- Deploying Kerberos for Strong Authentication to Hadoop Clusters.
- Setting up an RBAC Model for Fine-Grained Authorization on Data via Tools such as Sentry and Ranger.
- Configuring End-to-End Encryption Covering Data In Motion and At Rest at Entry and Exit Points.
- Enabling High Availability across all Services to Support 24x7 Applications on the Hadoop Platform.
- SME in Spark-Based Application Performance Tuning.
- Hardening Clusters to Support Mission-Critical & Highly Available Applications.
- Developed Templates for Big Data Use-Case & Capacity Requirement Gathering/Analysis.
- Writing & Updating Best-Practices Documents on Hadoop Ecosystem Components.
- Optimizing Resource Utilization on both Applications and Infrastructure.
- Developed a Smoke-Test Oozie Workflow for Hadoop Cluster Validation, used Pre- and Post-Maintenance and to Proactively Monitor Big Data Ecosystem Components with a Single Command (see the sketch after this list).
- Developed Mini Utilities to Retrieve Metrics/Reports from the Hadoop Cluster (see the sketch after this list).
- Redistributing Services from one Host to another within the Cluster to Facilitate Security, Load Balancing, and High Availability of Hadoop Services.
- Experience in Migrating Big Data Workloads/Clusters into Cloud.
- Setting up & Integrating Kafka with Existing Hadoop Environments to Ingest Large Volumes of Events/Logs for Further Processing on Hadoop.
- Customizing alerts from Cloudera Manager to classify alerts based on Severity.
- Managing & Tuning Solr to Support a large Solr based Application.
- Managing Data Governance on the Hadoop Cluster through Navigator and other External Tools.
- Moving Hadoop Service Databases from PostgreSQL to MySQL.
- Using ITIL Practices in Big Data Operations Management.
- Configuring Hybrid Clusters that Span Bare-Metal Hardware and the Cloud to Meet Applications' Bursting Needs.
- Solid Understanding of JVM Tuning and Troubleshooting.
- Mentoring Team members in Installation, Configuration & Automation tasks.
- Supporting Data Science Initiatives & Teams in using new Tools and Methodologies.
- Efficient in Troubleshooting and Identifying Root Causes in Distributed Platforms like Hadoop.
- Engineered APM Tools like Unravel & Pepperdata to Increase the Efficiency of Applications on Big Data Platforms.
- Proactively Monitoring the Platform rather than Reacting to Alerts/Situations.
- Understanding and Supporting the Network Infrastructure around Hadoop, such as Firewalls.
- Facilitating Downstream Applications in data extract process for Reporting & Visualization.
- Managing and Reviewing Hadoop Log Files for Retention & Auditing/Accounting Purposes.
- Configuring and Managing Hive UDFs/Extensions.
- Operationalized a Container-Based Hadoop Cluster using the BlueData Product.
- Experienced in Supporting Large-Scale Data Processing & Exploratory Analytics on Hadoop.
- Efficient in Handling Hadoop Cluster Account-Management Activities in both Linux & AD Environments.
- Managing and Supporting Microsoft R and Anaconda Python, including Installing Packages and Libraries.
- Performing PoCs and Demonstrations for Management/Stakeholders.
- Knowledge of Git, JIRA, and Jenkins Tool Functionality.
- Hands-on Experience in Container-Based Technologies such as Docker & Kubernetes.
- Being Point of contact for all Vendor relations & Coordination.
- Setting up and Ensuring DR Sites are up to date with Production on data & configuration.
- Strong Working Experience with Open-Source Technologies.
- Storing Unstructured Data in Semi-Structured Format on HDFS using HBase.
- Implemented Partitioning and Bucketing in Hive.
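
A minimal sketch of the kind of pre/post-maintenance smoke test described above; the scratch path and the exact set of probes are illustrative assumptions:

```sh
#!/usr/bin/env bash
# Hypothetical smoke test: exercise core Hadoop services after maintenance.
set -euo pipefail

TEST_DIR="/tmp/smoke_$(date +%s)"   # illustrative scratch path

# HDFS: write a probe file, read it back, then clean up
hdfs dfs -mkdir -p "$TEST_DIR"
echo "smoke" | hdfs dfs -put - "$TEST_DIR/probe.txt"
hdfs dfs -cat "$TEST_DIR/probe.txt" > /dev/null

# YARN: confirm the ResourceManager answers and nodes are running
yarn node -list -states RUNNING > /dev/null

# Hive: trivial query to touch the metastore
hive -e 'SHOW DATABASES;' > /dev/null

hdfs dfs -rm -r -skipTrash "$TEST_DIR"
echo "Smoke test passed."
```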
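And a sketch of a small metrics/report utility against the Cloudera Manager REST API; the host, port, credentials, and API version are assumptions that depend on the CM release in use:

```sh
#!/usr/bin/env bash
# Hypothetical report utility: list clusters and their health from the
# Cloudera Manager REST API. Endpoint details are illustrative.
CM_HOST="cm.example.com"          # assumed Cloudera Manager host
CM_PORT=7180                      # CM's default HTTP port
API_VERSION=v19                   # depends on the CM release

# CM_USER and CM_PASS are expected in the environment
curl -s -u "$CM_USER:$CM_PASS" \
  "http://$CM_HOST:$CM_PORT/api/$API_VERSION/clusters" \
  | python -m json.tool
```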
Confidential
Sr. Hadoop Administrator
Responsibilities:
- Hadoop Cluster Architecture & Implementation for Big Data Workloads.
- Installing and Upgrading Cloudera CDH & Hortonworks HDP Versions through the Cloudera Manager and Ambari Management Tools.
- Hardening Clusters to Support Mission-Critical & Highly Available Applications.
- Building Scalable Clusters in the Cloud (AWS).
- Redistributing Services from one Host to another within the Cluster to Help Secure the Cluster and Ensure High Availability of Services.
- Implementing Security on the Hadoop Cluster via Kerberos/LDAP and by Encrypting Data In Motion & At Rest.
- Used Spark to Build Fast Analytics for the ETL Process and Constructed an Ingest Pipeline using Spark Streaming.
- Setting up Kafka (Confluent & Cloudera) for ingesting large volumes of events/logs into Hadoop.
- Managing Data Governance on the Hadoop Cluster through Navigator and other External Tools.
- Transferring Data from RDBMS to HDFS using Available Methodologies.
- Moving Service Databases from PostgreSQL to MySQL.
- Identifying the Best Solutions/Proofs of Concept Leveraging Big Data & Advanced Analytics that Meet and Exceed the Customer's Business, Functional, and Technical Requirements.
- Strong Working Experience with Open-Source Technologies.
- Experience in Addressing Scale and Performance Problems.
- Storing Unstructured Data in Semi-Structured Format on HDFS using HBase.
- Used Change Management and Incident Management Processes following Organization Guidelines.
- Implemented Partitioning and Bucketing in Hive.
- Knowledge of Java Virtual Machines (JVMs) and Multithreaded Processing.
- Strong troubleshooting and performance tuning skills in Hadoop cluster & network administration.
- Continuously Monitoring and Managing the Hadoop Cluster through Cloudera Manager.
- Strong network background with a good understanding of TCP/IP, firewalls and DNS.
- Demonstrating Proofs of Concept to Management/Stakeholders.
- Knowledge of Git, JIRA, and Jenkins Tool Functionality.
- Mentoring & supporting team members in managing Hadoop clusters.
- Reviewing and Updating all Configuration & Documentation of Hadoop Clusters as part of Continuous-Improvement Processes.
- Handled Importing of Data from Various Data Sources using Sqoop & Flume (see the Sqoop sketch after this list).
- Performed data completeness, correctness, data transformation and data quality testing using available tools & techniques.
- Facilitating upstream applications in data extract process for Reporting & Visualization.
- Experience in providing support to Data Science team in exploring new tools and methodologies.
- Managing and Reviewing Hadoop Log Files for Retention & Auditing/Accounting Purposes.
- Setting up Databases in Hive, Creating Tables, and Loading Data from Downstream Apps.
- Installing and configuring Hive and supporting custom Hive UDFs/Extensions.
- Experience in Large-Scale Data Processing, either In-House or in Cloud Environments.
- Efficient in Handling Hadoop Cluster Account-Management Activities in both Linux & AD Environments.
- Supported technical team members for automation, installation and configuration tasks.
- Wrote Shell Scripts to Monitor the Health of Hadoop Daemon Services and Respond to any Warning or Failure Conditions (see the sketch after this list).
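
A minimal sketch of the daemon health-check script mentioned in the last bullet; the daemon list and alert address are illustrative:

```sh
#!/usr/bin/env bash
# Hypothetical health check: verify key Hadoop daemons are running on this
# host and mail an alert when one is missing.
DAEMONS=(NameNode DataNode ResourceManager NodeManager)
ALERT_TO="hadoop-ops@example.com"   # assumed alert address

for d in "${DAEMONS[@]}"; do
  # jps lists running JVMs by main class name
  if ! jps | grep -qw "$d"; then
    echo "$(hostname): $d is not running" \
      | mail -s "Hadoop daemon alert: $d" "$ALERT_TO"
  fi
done
```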
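And a sketch of the kind of Sqoop import used for the RDBMS-to-HDFS transfers above; the connection string, table, and target path are assumptions:

```sh
# Hypothetical Sqoop import: pull one MySQL table into HDFS.
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /data/raw/orders \
  --num-mappers 4
```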
Confidential, Peoria, IL
Sr. Hadoop Administrator
Responsibilities:
- Develop, document and revise system/cluster design procedures, test procedures, and quality standards.
- Aligning with Development Teams to Deploy Code and Launch New Big Data Applications.
- Testing, Maintaining, and Monitoring Computer Programs and Systems, including Coordinating the Installation of Computer Programs and Systems.
- Helping End Users and Developers Access the Hadoop Cluster without Issues.
- Providing ongoing support for Hadoop platform.
- Expanding or Modifying Systems to Serve New Purposes or Improve Workflow as part of the Continuous-Improvement Plan for Existing Applications and Systems.
- Being the Point of Contact for Vendor Escalations.
- Architecting & Implementing new Hadoop Clusters with Appropriate Security Protocols.
- Proactively Monitoring Hadoop Components and the Underlying Hardware.
- Automating Repetitive Manual Tasks using Scripting Languages such as Shell and Python.
- Upgrading the Hadoop Cluster Version from time to time to Improve Functionality and Fix Issues in the Hadoop Base Code.
- Migrating Services from one System to another within the Cluster.
- Implementing Security using Protocols such as Kerberos and LDAP, with Encryption for Data In Motion and At Rest (see the Kerberos sketch after this list).
- Performance Tuning at the Cluster Level and, to some extent, at the Application Level.
- Providing Ongoing Support for Existing Clusters: Connectivity Problems to and from Hadoop, Security Issues, Account & Access Management, and Performance Issues with Jobs or Queries.
- Hardening the Hadoop Cluster to Support Mission-Critical and Highly Available Applications, and Resolving any other Issues Hampering Application Progress within the Hadoop Platform.
- Architecting and Designing new clusters for Hadoop Ecosystem.
- Installing and Upgrading Cloudera CDH & Hortonworks HDP Versions through the Cloudera Manager and Ambari Management Tools.
- Redistributing Services from one Host to another within the Cluster to Help Secure the Cluster and Ensure High Availability of Services.
- Implementing Security on the Hadoop Cluster via Kerberos/LDAP and by Encrypting Data In Motion & At Rest.
- Used Spark to Build Fast Analytics for the ETL Process and Constructed an Ingest Pipeline using Spark Streaming.
- Managing Data Governance on the Hadoop Cluster through Navigator and other External Tools.
- Transferring Data from RDBMS to HDFS using Available Methodologies.
- Moving Service Databases from PostgreSQL to MySQL.
- Identifying the Best Solutions/Proofs of Concept Leveraging Big Data & Advanced Analytics that Meet and Exceed the Customer's Business, Functional, and Technical Requirements.
- Strong Working Experience with Open-Source Technologies.
- Experience in Addressing Scale and Performance Problems.
- Storing Unstructured Data in Semi-Structured Format on HDFS using HBase.
- Used Change Management and Incident Management Processes following Organization Guidelines.
- Implemented Partitioning and Bucketing in Hive.
- Knowledge of Java Virtual Machines (JVMs) and Multithreaded Processing.
- Strong troubleshooting and performance tuning skills in Hadoop cluster & network administration.
- Continuously Monitoring and Managing the Hadoop Cluster through Cloudera Manager.
- Strong network background with a good understanding of TCP/IP, firewalls and DNS.
- Demonstrating Proofs of Concept to Management/Stakeholders.
- Knowledge of Git, JIRA, and Jenkins Tool Functionality.
- Mentoring & supporting team members in managing Hadoop clusters.
- Reviewing and Updating all Configuration & Documentation of Hadoop Clusters as part of Continuous-Improvement Processes.
- Handled Importing of Data from Various Data Sources using Sqoop & Flume.
- Performed data completeness, correctness, data transformation and data quality testing using available tools & techniques.
- Facilitating upstream applications in data extract process for Reporting & Visualization.
- Experience in providing support to Data Science team in exploring new tools and methodologies.
- Managing and Reviewing Hadoop Log Files for Retention & Auditing/Accounting Purposes.
- Setting up Databases in Hive, Creating Tables, and Loading Data from Downstream Apps.
- Installing and configuring Hive and supporting custom Hive UDFs/Extensions.
- Experience in Large-Scale Data Processing, either In-House or in Cloud Environments.
- Efficient in Handling Hadoop Cluster Account-Management Activities in both Linux & AD Environments.
- Supported technical team members for automation, installation and configuration tasks.
- Wrote Shell Scripts to Monitor the Health of Hadoop Daemon Services and Respond to any Warning or Failure Conditions.
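
A minimal sketch of verifying Kerberized access after enabling the security protocols referenced above; the principal, realm, and keytab path are assumptions:

```sh
#!/usr/bin/env bash
# Hypothetical Kerberos check: authenticate with a service keytab and
# confirm HDFS access on a secured cluster.
kinit -kt /etc/security/keytabs/hdfs.keytab "hdfs/$(hostname -f)@EXAMPLE.COM"
klist                               # show the ticket just obtained
hdfs dfs -ls / > /dev/null && echo "Kerberized HDFS access OK"
```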
Confidential, Atlanta, GA
Big Data Operations Engineer
Responsibilities:
- Installation and Configuration of Hadoop 1.0 & 2.0 Clusters, and Maintenance through Cluster Monitoring & Troubleshooting.
- Storing Data Coming from Genomic Research, Labs, Sensors, and EHRs into the Hadoop Cluster for Further Processing to Make Better Sense of the Data.
- Transferring Data from RDBMS to HDFS using Available Methodologies.
- Identified the Best Solutions/Proofs of Concept Leveraging Big Data & Advanced Analytics that Meet and Exceed the Customer's Business, Functional, and Technical Requirements.
- Installed Cloudera Manager Server and configured the database for Cloudera Manager Server.
- Storing Unstructured Data in Semi-Structured Format on HDFS using HBase.
- Used Change Management and Incident Management Processes following Company Standards.
- Implemented Partitioning, Dynamic Partitions, and Buckets in Hive (see the sketch after this list).
- Knowledge of Java Virtual Machines (JVMs) and Multithreaded Processing.
- Continuously Monitoring and Managing the Hadoop Cluster through Cloudera Manager.
- Strong network background with a good understanding of TCP/IP, firewalls and DNS.
- Demonstrating Live Proofs of Concept to Clients.
- Supported Technical Team Members in Managing and Reviewing Hadoop Log Files and Data Backups.
- Continuously Improving all Process-Automation Scripts and Tasks.
- Handled Importing of Data from Various Data Sources, Performed Transformations using Hive and MapReduce, Loaded Data into HDFS, and Extracted Data from MySQL into HDFS using Sqoop.
- Performed data completeness, correctness, data transformation and data quality testing using SQL.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Experience in Supporting Data Analysts in Running Pig and Hive Queries.
- Managing and Reviewing Hadoop Log Files.
- Creating Hive Tables, Loading them with Data, and Writing Hive Queries that Run Internally as MapReduce Jobs.
- Installing and Configuring Hive and Writing Custom Hive UDFs.
- Experience in Large-Scale Data Processing on an Amazon EMR Cluster.
- Supported technical team members for automation, installation and configuration tasks.
- Wrote Shell Scripts to Monitor the Health of Hadoop Daemon Services and Respond to any Warning or Failure Conditions.
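
A minimal sketch of the Hive partitioning/bucketing mentioned above; the table, columns, and staging source are illustrative:

```sh
# Hypothetical Hive DDL/DML showing dynamic partitions and buckets.
hive -e "
  SET hive.exec.dynamic.partition=true;
  SET hive.exec.dynamic.partition.mode=nonstrict;

  CREATE TABLE IF NOT EXISTS events (id BIGINT, payload STRING)
  PARTITIONED BY (event_date STRING)
  CLUSTERED BY (id) INTO 16 BUCKETS
  STORED AS ORC;

  -- dynamic partitioning: Hive routes rows by event_date at insert time
  INSERT OVERWRITE TABLE events PARTITION (event_date)
  SELECT id, payload, event_date FROM staging_events;
"
```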
Confidential, Tysons Corner, VA
Linux Engineer
Responsibilities:
- Built Servers using Kickstart, Red Hat Satellite Server, and vSphere Client.
- Applying Patches to Keep Servers Updated against Operating-System Bugs using Red Hat Satellite Server, yum, etc. (see the sketch after this list).
- Working with diverse groups to Support in-house application requirements.
- User Profile Management, Group Management, and Administrative Delegations.
- Providing Support to Account Managers, UNIX and Windows Technicians, and other Departments.
- Worked exclusively on VMware virtual environment.
- Experience using Veeam to Move VMs from one Datacenter to another.
- Involved in the Installation and Configuration of Various Third-Party Software Packages on Servers.
- Involved in an Oracle/SQL Upgrade Project that Included Various Linux Builds on Different OS Platforms across Multiple Data Centers.
- Coordinated with Various Cross-Functional Teams across IT Operations to Ensure Smooth Functioning of Projects.
- Worked Closely with the DBA Team to Adjust Kernel Parameters as per Requirements.
- Installed, configured and provided support for Tivoli Monitoring software across various OS platforms like RHEL, AIX and Solaris.
- Installed Packages using YUM and the Red Hat Package Manager (RPM) on Various Servers.
- Day-to-Day Resolution of Linux-Based Issues through the SMS Ticketing System in Compliance with SLAs.
- Automating many Day-to-Day Tasks through Bash Scripting.
- Worked with Red Hat Satellite Server to Push Changes across Various Servers Simultaneously.
- Performed Daily System Administration Tasks such as Managing System Resources, End-User Support Operations, and Security.
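
A minimal sketch of the patching automation referenced above; the log path is illustrative, and the --security option assumes the yum security plugin is available:

```sh
#!/usr/bin/env bash
# Hypothetical patching helper: apply security errata with yum and report
# whether a reboot is needed afterwards.
LOG="/var/log/patch_run.log"        # assumed log location

# check-update exits non-zero when updates exist, so don't abort on it
yum --security check-update >> "$LOG" 2>&1 || true
yum -y --security update    >> "$LOG" 2>&1

# needs-restarting ships with yum-utils; -r returns non-zero when a
# reboot is required
if needs-restarting -r > /dev/null 2>&1; then
  echo "No reboot required."
else
  echo "Reboot required after patching."
fi
```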
Confidential, Norwalk, CT
Sr. iSeries Subject Matter Expert SME
Responsibilities:
- Performed IBM i OS Upgrade/Installation and DR Tests with V5R4, V6R1 & V7R1.
- New Server Builds as per Business Need in a Fast-Paced Environment.
- Managing LPARs through HMC and Planning for IPL as and when required.
- Storage Management on both Internal Disks and External LUNs (SAN).
- Managed ATL and VTL Tape Environments at Large Scale.
- Managing User Profiles and their authority levels through Authorization Lists.
- Applying PTFs & TRs based on system needs to meet business capabilities.
- Controlling Group Profiles to Manage Existing and New Users' Needs.
- Providing Secure Environments to the Business using Functions like SSL and SSH, and Ensuring the same through Random Checks and Various Audit Models.
- Implementing Capacity Management to Provide Inputs for Future Enhancements/Upgrades to the Existing Configuration Setup to Meet Growing Business Needs.
- Troubleshooting all Escalated Problems Pertaining to the AS/400 Production Environment.
- Instrumental in Preparing Periodic Service-Level Reports & Reviewing them with the Client.
- Mentoring the Team to Handle Client Calls and Requests.
- Handling Jobs that Overrun, Loop, or Consume Excessive CPU or ASP.
- Responsible for Running Bridge Calls & Driving them to Resolve Production-Critical Issues.
- Performed H/W Migration from one model to another model.
- Managing Job Scheduling Tasks using Native and ROBOT Scheduling Utilities.
- Prioritizing Jobs in the Nightly Batch Cycle as per Customer Requirements.
- Initiating RCAs/Problem Tickets on Recurring Problems and Driving them to Closure by Implementing Permanent Solutions.
- Ensuring Object-Level Authority of Data on Production Systems.
- Initiating Service-Improvement Plans for Backup, Housekeeping, Monitoring & Restore, for Optimized Performance and to Fulfill Customer Needs.
- SLA based Service Delivery & maintaining Quality of Technical Service.
- SPOC for the iSeries Product Line, Providing any kind of Information for the Customer.
- Aligning the Process with ITIL standards to achieve better results and Customer Satisfaction.
- Reviewing Weekend Changes and supporting the team for successful implementation.
- Maintaining a SOX-Regulated Environment in Compliance with the Standards.
- Supporting MQSeries on Multiple AS/400 Environments.
- Providing Ad-Hoc Support for the ERP Application Hosted on the AS/400 Platform.
Confidential, Washington, DC
AS/400 Administrator
Responsibilities:
- Experience in AS/400 OS/400 Upgrades & Disaster-Recovery Tests.
- Managing LPARs through HMC and Performing IPL as and when required.
- Storage Management on both Internal Disks and External LUNs (SAN).
- Managed ATL and VTL Tape Environments at Large Scale.
- Managing User Profiles and their authority levels through Authorization Lists.
- Controlling the Group Profiles to manage Existing and New users.
- Granting Authority to Users/Objects on Clients' Requests.
- Monitoring & Controlling the Job Flow Cycle; Implementing SSL in the Existing Environment.
- Troubleshooting all Problems Pertaining to AS/400 in the Production Environment.
- PTF Installations at Scheduled Intervals.
- Helping the Monitoring Team Handle Client Calls and Requests.
- Handling Situations where a Job Overruns, Goes into a Loop, or Consumes Excessive CPU or ASP.
- Responsible for Running Bridge Calls and Driving them to Resolve Production Issues.
- Knowledge of H/W Migration from one model to another model.
- Working with ROBOT/SCHEDULER and ROBOT/SAVE Products.
- Prioritizing Jobs in the Nightly Batch Cycle as per Customer Requirements.
- Performing End-of-Day (EOD) Processing & System Monitoring, and Backup/Restoration of Libraries/Objects as per User Requests.
- Ensuring Object-Level Authority of Data on Production Systems.
- Performing Security Audits on Production Systems on a Quarterly Basis.
- SLA based Service Delivery & maintaining Quality of Technical Service.
- SPOC for Analyzing and Resolving Problems Related to AS/400 Servers and the Operating System.
- Creating and Changing Control Groups and Policies in BRMS.
- Restoring Application Data as Requested.
- Performing Virtual Role Swaps with and without Communications.
- Syncing New Data with the Target Node using Replication Tools like MIMIX.
- Managing/Monitoring Journal Behavior in the HA Environment.
- Applying iTERA fixes upon their availability.
- Monitoring and Fixing Errors that Arise in Data Groups.
- Managing Data Groups, including Adding Ad-Hoc Libraries to an Existing Data Group.
- Keeping New Data in Sync with the Target System.
- Applying MIMIX Patches and Upgrading to next available releases.