Hadoop Architect Resume
Atlanta, GA
SUMMARY
- 15+ years of IT experience in Business Analysis, Design, Data Modeling, Development and Implementation, with a focus on the critical areas of Java, Big Data, ETL and Database Applications
- 5+ years of experience in Apache Hadoop architecture, managing critical deliveries from multiple vendors across geographies
- Involved in all critical issue discussions to bring projects to an on-time conclusion
- Good experience in core Java applications, database streamlining and middleware systems under both Waterfall and Agile SDLC
- Experience with Apache Hadoop components such as HDFS, MapReduce, Hive, HBase, Pig, Sqoop, Oozie and Flume, and in supporting Big Data analytics
- In-depth understanding of Hadoop architecture and its components, including HDFS, NameNode, DataNode, JobTracker, TaskTracker and MapReduce concepts
- Experience in installation, configuration, support and management of Hadoop clusters on Apache, Cloudera and Amazon distributions
- Experience in coordinating data analysis using HiveQL, Pig Latin and custom Java MapReduce programs
- Understanding, analyzing, managing and reviewing Hadoop, HDFS, JobTracker and TaskTracker log files
- Worked with Sqoop to move (import/export) data between relational databases and Hadoop, and used Flume to collect data and populate HDFS
- Worked with HBase to conduct quick lookups (updates, inserts and deletes) in Hadoop
- Experience in working with cloud infrastructure like Amazon Web Services (AWS)
- Experience in understanding clients' Big Data business requirements and translating them into Hadoop-centric solutions
- Experience in end-to-end DevOps
- Proficient in shell scripting and Python
- Involved in cluster capacity planning, hardware planning, installation and performance tuning of the Hadoop cluster
- Defining job flows in the Hadoop environment using tools such as Oozie for data scrubbing and processing
- Experience in configuring ZooKeeper to provide cluster coordination services
- Experience in benchmarking and in backup and recovery of NameNode metadata and data residing on the cluster
- Familiar with commissioning and decommissioning nodes on a Hadoop cluster (a decommissioning sketch follows this summary); adept at configuring NameNode High Availability
- Worked on disaster recovery for Hadoop clusters
- Experience in deploying and managing multi-node development, testing and production clusters
- Experience in understanding Hadoop security requirements and integrating with Kerberos authentication infrastructure: KDC server setup, creating the realm/domain, managing principals, generating a keytab file for each service and managing keytabs using keytab tools (a keytab sketch follows this summary)
- Set up NameNode High Availability for a major production cluster and designed automatic failover using ZooKeeper and Quorum Journal Nodes
- Analyzing clients' existing Hadoop infrastructure, identifying performance bottlenecks and providing performance tuning accordingly
- Experience in maintaining production databases
- Experience in installing and uninstalling applications in Windows and Linux environments
- Experience in configuring and maintaining mirroring and log shipping
- Experienced in supporting all phases of the Software Development Life Cycle (SDLC)
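Illustrative sketch of the DataNode decommissioning flow mentioned above; the hostname, exclude-file path and grep pattern are assumptions, not taken from any specific cluster:
  # Add the host to the exclude file referenced by dfs.hosts.exclude (path is assumed)
  echo "datanode07.example.com" >> /etc/hadoop/conf/dfs.exclude
  # Ask the NameNode to re-read its include/exclude lists; the node enters
  # "Decommission In Progress" while its blocks are re-replicated elsewhere
  hdfs dfsadmin -refreshNodes
  # Watch progress until the node is reported as "Decommissioned"
  hdfs dfsadmin -report | grep -A 3 "datanode07.example.com"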
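A minimal sketch of the per-service keytab generation described in the Kerberos bullet, assuming an MIT Kerberos KDC; the realm, hostname and keytab path are placeholders:
  # Create a service principal for an HDFS daemon host (realm and host are placeholders)
  kadmin.local -q "addprinc -randkey hdfs/datanode01.example.com@EXAMPLE.COM"
  # Export the principal's keys into a keytab file for that service
  kadmin.local -q "xst -k /etc/security/keytabs/hdfs.service.keytab hdfs/datanode01.example.com@EXAMPLE.COM"
  # Verify the keytab contents with the standard keytab tools
  klist -kt /etc/security/keytabs/hdfs.service.keytab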
TECHNICAL SKILLS
Hadoop: Cloudera 4/5 Manager/Hue, AWS, Apache Hadoop, Hortonworks HDP, Hive, Sqoop
Tools: Pig, MS ETL, Eclipse, JIRA, MPP, MS Office
Languages: SQL, PL/SQL, Java, Shell Script, Python
Database/File System: Oracle 10g/9i, SQL Server, DB2, Hadoop HDFS, Amazon Redshift
Operating Systems: Linux and Windows
PROFESSIONAL EXPERIENCE
Confidential, Atlanta, GA
Hadoop Architect
Responsibilities:
- Designing the ETL data pipeline flow to ingest data from RDBMS sources into Hadoop using shell scripts, Sqoop and MySQL
- Creating cluster planning and capacity reports
- Coding tcollector/Python scripts to fetch time-series metrics from Cloudera Manager and report them to Wavefront
- Implementing NameNode High Availability and Hadoop cluster capacity planning to add and remove nodes
- Creating Control-M job flows to load daily/incremental data
- Creating Sqoop jobs to import data and load it into HDFS (see the Sqoop job sketch after this list)
- Generating reports using Hive
- Ingesting data from heterogeneous source systems and creating the framework for incremental loads and data validation
- Maintaining SVN repositories for DevOps environment automation code and configuration
- Creating a UI to search manufacturing data in TDW using Play Framework, JavaScript and HTML
- Analyzing current data sources and schemas based on the use case documentation provided, and developing programs and scripts to complete data ingestion into the Hadoop cluster
- Creating architecture, design and operations documentation as directed by the program manager
- Monitoring the cluster, executing benchmark test cases and recommending configurations
- Implementing Spark on Hive to improve query performance
- Preparing reports for business users to analyze
- Working closely with the data science team to analyze the data and to calculate and build the similarity recipe search functionality
- Calculating etch and deposition rates and enabling early detection of defects
- Interacting with the Cloudera support team to fix installation bugs and to fine-tune the cluster for optimal performance
- Designing Hive partitioned tables and file formats to support e3 analytics (see the Hive table sketch after this list)
- Conducting POCs to test new file formats, tools and frameworks
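An illustrative saved Sqoop job for the incremental import/load pattern listed above; the connection string, credentials, table, check column and target directory are hypothetical:
  # Define a saved Sqoop job that imports only rows added since the last run
  sqoop job --create daily_orders_import -- import \
    --connect jdbc:mysql://dbhost.example.com:3306/sales \
    --username etl_user --password-file /user/etl/.db_password \
    --table orders \
    --incremental append --check-column order_id --last-value 0 \
    --target-dir /data/raw/sales/orders
  # Execute the job; Sqoop records the new last-value in its metastore for the next run
  sqoop job --exec daily_orders_import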
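A sketch of the partitioned, columnar Hive table design referred to above, invoked through the shell; the database, table and column names are made up for illustration:
  # Create a date-partitioned Parquet table and load one partition from a staging table
  hive -e "
  CREATE TABLE IF NOT EXISTS analytics.tool_metrics (
    tool_id STRING,
    metric_name STRING,
    metric_value DOUBLE
  )
  PARTITIONED BY (run_date STRING)
  STORED AS PARQUET;
  INSERT OVERWRITE TABLE analytics.tool_metrics PARTITION (run_date='2017-06-01')
  SELECT tool_id, metric_name, metric_value
  FROM staging.tool_metrics_raw
  WHERE run_date='2017-06-01';
  "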
Environment: Cloudera, Hue, HDFS, Pig, Hive, SQL Server, Oracle 10g and UNIX, Spark, Kerberos, MobaX, Putty, WinSCP, VirtualBox, Shell Scripting
Confidential, New York, NY
Hadoop Architect
Responsibilities:
- Installed NameNode, Secondary NameNode, YARN (ResourceManager, NodeManager, ApplicationMaster) and DataNodes
- Installed and configured HDP1
- Responsible for implementation and ongoing administration of the Hadoop infrastructure
- Monitored an already configured cluster of 20 nodes
- Installed and configured the Hadoop components HDFS, Hive and HBase
- Communicated with the development teams and attended daily meetings
- Addressed and troubleshot issues on a daily basis
- Worked on DevOps tools such as Chef, Artifactory and Jenkins to configure and maintain the production environment
- Worked with data delivery teams to set up new Hadoop users, including creating Linux users, setting up Kerberos principals and testing HDFS and Hive access (see the user-setup sketch after this list)
- Performed cluster maintenance, including the addition and removal of nodes
- Monitored Hadoop cluster connectivity and security
- Managed and reviewed Hadoop log files
- Performed file system management and monitoring
- Provided HDFS support and maintenance
- Teamed with the infrastructure, network, database, application and business intelligence teams to ensure high data quality and availability
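A sketch of the new-user onboarding steps described above; the user name, realm and paths are assumptions:
  # Create the Linux account on the gateway/edge node
  useradd -m analyst1
  # Create a matching Kerberos principal (realm is a placeholder)
  kadmin.local -q "addprinc analyst1@EXAMPLE.COM"
  # Provision an HDFS home directory owned by the new user
  hdfs dfs -mkdir -p /user/analyst1
  hdfs dfs -chown analyst1:analyst1 /user/analyst1
  # Smoke-test HDFS and Hive access as the new user
  su - analyst1 -c "kinit analyst1@EXAMPLE.COM && hdfs dfs -ls /user/analyst1"
  su - analyst1 -c "hive -e 'SHOW DATABASES;'"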
Environment: Cloudera, HDFS, Hive, SQL Server, Oracle 10g and UNIX, MobaX, Putty, WinSCP, VirtualBox, Shell Scripting
Confidential
Technical Project Lead
Responsibilities:
- Handling end-to-end projects from specification through delivery and support
- Participating in functional and design discussions, supporting the requirements and leading them through successful testing
- Involved in requirement change discussions and enhancements
- Checking day-to-day activities, tracking them in burn-down charts and reporting project statistics to management
- Coordinating with supporting applications upstream and downstream of the data flow
- Getting approvals for schedule and architecture changes from both functional and domain experts
- SPOC for ticket handling in JIRA and Quality Center
- Tracking and resolving issues with the team
- Creating reports for presentation to senior management and mitigating issues raised by the management teams
- Developing application modules using Java (J2EE) as the programming language
Environment: Java, Spring, Struts 2.0, XML, JavaScript and CSS for the GUI; database access via iBatis and Hibernate
Confidential, New York
Technical Project Lead
Responsibilities:
- Handling the offshore and onsite teams through delivery
- Performing database analysis on trade system applications to provide accurate results in the testing phase
- The role also demanded functional/implementation understanding and support for the project leaders and team members
- Working with SIT, UAT, Production and Warranty Support
- Prioritizing tasks across the teams and groups
- The role also involved internal reviews and checklists for the defined tasks, as well as building and managing servers with the IES team
Environment: Windows Server 2003; the build involved installing IIS, SSL and Oracle and configuring the application DLLs
Confidential
Developer
Responsibilities:
- Functional and technical coordinator
- Single point of contact between onsite and offshore teams
- Planning the various activities, prioritizing tasks and providing feedback to the team
- Defect analysis, design and performance testing reviews
- Domain discussions with the clients and other stakeholders
- Reviews of CR/enhancements, team coordination.
- Quality reviews, checklist implementation and version controlling.
Environment: JSP, EJB, Struts Framework, Oracle, PL/SQL, Crystal Reports
Confidential
Developer
Responsibilities:
- Communicating with the client and providing feedback to the team
- Analysis, design, development and testing
- Quality management and reporting to top-level management
- Managing VSS and using the iTrack application for tracking
- Data migration and data verification
- Application architecture reviews
Environment: JSP, Java, Oracle, Crystal Reports