Hadoop Admin Resume
Atlanta, GA
SUMMARY
- 6+ years of professional experience working with data, including 3+ years of hands-on experience in the analysis, design, development, and maintenance of Hadoop and Java based applications.
- Working experience with the Hortonworks (HDP, HDF) and Cloudera distributions.
- Excellent experience in installing and supporting production clusters and handling critical issues.
- Coordinated with technical teams to integrate third-party security tools such as Interset into the system.
- Performed upgrades, patches and bug fixes in HDP and CDH clusters.
- Experience building operations dashboards from the HDFS FsImage to project existing and forecasted data growth.
- Built various automation plans from an operations standpoint.
- Participated in building Splunk dashboards for reporting access breaches.
- Hands-on experience with security applications such as Ranger, Knox, and Kerberos.
- Excellent experience running Hadoop operations on the ETL infrastructure with other BI teams and tools such as Teradata and Tableau.
- Knowledge of extracting an Avro schema using avro-tools, extracting XML using an XSD, and evolving an Avro schema by changing its JSON definition (see the sketch after this list).
- Strong problem-solving, organizing, team management, communication, and planning skills, with the ability to work in a team environment. Able to write clear, well-documented, well-commented, and efficient code per requirements.
- Capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture.
- Able to assess business rules, collaborate with stakeholders, and perform source-to-target data mapping, design, and review.
- Familiar with data architecture, including data ingestion pipeline design, Hadoop information architecture, data modeling, data mining, machine learning, and advanced data processing.
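A minimal sketch of the avro-tools schema workflow referenced above; the file names and paths are illustrative, and on some systems the tool is invoked as java -jar avro-tools-<version>.jar rather than an avro-tools wrapper script:

    # Extract the writer's schema embedded in an Avro data file.
    avro-tools getschema /data/landing/events/part-00000.avro > events.avsc

    # Spot-check a few records as JSON to confirm a schema evolution took effect.
    avro-tools tojson /data/landing/events/part-00000.avro | head -n 5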
TECHNICAL SKILLS
Big Data Ecosystems: Hadoop, MapReduce, HDFS, HBase, ZooKeeper, Hive, Pig, Sqoop, Spark, Cassandra, Oozie, Flume, Kafka, and Talend
Programming Languages: Java, C/C++, Scala, Python, and Shell Scripting
Scripting Languages: JavaScript, XML, HTML, Python, Linux Bash Shell Scripting, and UNIX
Tools: Eclipse, JDeveloper, JProbe, CVS, MS Visual Studio
Platforms: Windows (2000/XP), Linux, Solaris
Databases: NoSQL, Oracle, DB2, MS SQL Server (2000, 2008), Teradata, HBase, Cassandra, Cloudera 5.9
PROFESSIONAL EXPERIENCE
Confidential, Atlanta, GA
Hadoop Admin
Responsibilities:
- Design scalable Hadoop deployment architectures with features such as high availability and load balancing
- Manage more than 30 Hadoop clusters, including 10 live production clusters
- Administer a Hadoop cluster with 300+ nodes
- Provide on-call support and act as a Hadoop SME across the organization
- Liaise with PMs, developers, and data scientists to understand requirements and implement solutions
- Secure the Hadoop clusters using Kerberos, Sentry, and encryption
- Responsible for performance tuning, design, capacity planning, and benchmarking of Hadoop clusters
- Plan upgrades and patching, and work with vendor teams to resolve bugs and issues
- Create data and replication pipelines to/from the Hadoop cluster
- Perform case studies, evaluate new technologies, and implement them in the current solution
- Build a DR site with real-time replication of HBase and Hive data
- Lay out the data ingestion pipeline using tools such as Flume, Kafka, WebHDFS, and Sqoop
- Use Hive as a relational schema on Hadoop for the ETL workflows
- Work extensively with HBase on heap tuning, JVM tuning, GC tuning, data loading, and replication
- Adopt Impala as an in-memory query engine for analytics tools such as SAS, Tableau, and Python Ibis for data science workloads
- Automate cluster deployment and other activities using Ansible and the Cloudera Manager API in Python (see the sketch after this list)
- Set up Hadoop clusters on the AWS cloud with and without Cloudera Director
- Collaborate with the UNIX infrastructure team to implement best practices for Hadoop
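A minimal sketch of the kind of Cloudera Manager REST API calls the Ansible/Python automation above would wrap; the host, credentials, API version, and cluster name are placeholders, not details from this environment:

    #!/usr/bin/env bash
    CM="http://cm-host.example.com:7180/api/v19"   # hypothetical CM host and API version
    AUTH="admin:admin"                             # placeholder credentials

    # List the clusters managed by this Cloudera Manager instance.
    curl -s -u "$AUTH" "$CM/clusters"

    # Check the state of every service on one cluster.
    curl -s -u "$AUTH" "$CM/clusters/ProdCluster1/services"

    # Kick off a cluster-wide restart; CM returns an asynchronous command ID to poll.
    curl -s -u "$AUTH" -X POST "$CM/clusters/ProdCluster1/commands/restart"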
Confidential, Dallas, TX
Hadoop Admin
Responsibilities:
- Maintained the 160-node production Hadoop cluster along with a smaller stage cluster
- Installed software on the Hadoop cluster to support data science and analytics users, including but not limited to R, Python packages, Spark, and other open-source big data components/services
- Maintained user accounts, system security, and cluster resource management to optimize user query execution time and data loading time
- Helped with the migration of Hadoop to a cloud-based distribution and helped decide on and implement the new configuration
- Helped with data migration and issue resolution while migrating the existing Hadoop/Java applications from the on-premises cluster to the cloud distribution
- Experienced in supporting data science teams, super users, and analytics teams on complex code deployment, debugging, and performance optimization problems
- Experienced in Hadoop, HDFS, Hive, Spark, R, Python, Java and UNIX
- System monitoring and controls
- Experienced in managing CPU, memory, and storage resources for a large Hadoop cluster with hundreds of users (see the sketch after this list)
- Worked exclusively with Hortonworks and IBM BigInsights (4.1).
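A sketch of the routine resource checks implied above, using standard Hadoop CLI tools; the HDFS path is illustrative:

    # HDFS capacity, per-DataNode usage, and under-replicated block counts.
    hdfs dfsadmin -report | head -n 20

    # Block-level health check on a heavily used directory.
    hdfs fsck /user/analytics -blocks -locations | tail -n 20

    # Live top-style view of YARN applications and their CPU/memory consumption.
    yarn top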
Confidential, Dallas, TX
Big Data Admin/Developer
Responsibilities:
- The client used the Hortonworks distribution of Hadoop to store and process large volumes of data generated by different enterprises.
- Installed YARN (ResourceManager, NodeManager, ApplicationMaster) and created volumes and CLDB on edge nodes.
- Responsible for implementation and ongoing administration of MapR infrastructure.
- Monitored an already-configured cluster of 54 nodes.
- Installed and configured Hadoop components, Hive, Pig, and Hue.
- Communicated with the development teams and attended daily meetings.
- Addressed and troubleshot issues daily.
- Launched R, a statistical tool for statistical computing and graphics.
- Worked with data delivery teams to set up new Hadoop users, including setting up Linux users, setting up Kerberos principals, and testing MFS and Hive.
- Cluster maintenance as well as creation and removal of nodes.
- Monitored Hadoop cluster connectivity and security
- Worked on large sets of structured, semi-structured and unstructured data.
- Used Sqoop to import and export data between Oracle RDBMS and HDFS (see the sketch below).
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Diligently teamed with the infrastructure, network, database, application, and business intelligence teams to guarantee high data quality and availability.
Environment: Apache Hadoop 0.20.203, HDFS, Java MapReduce, Eclipse, Hive, Pig, Sqoop, Oozie, SQL, Oracle 11g.
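A sketch of the Sqoop-to-Hive flow described in the bullets above; the connection string, credentials, schema, table, and column names are invented for illustration:

    # Import an Oracle table into HDFS.
    sqoop import \
      --connect jdbc:oracle:thin:@//oradb.example.com:1521/ORCL \
      --username etl_user -P \
      --table SALES.ORDERS \
      --target-dir /user/etl/orders \
      --num-mappers 4 \
      --fields-terminated-by '\t'

    # Export processed results back to Oracle (the vice-versa direction).
    sqoop export \
      --connect jdbc:oracle:thin:@//oradb.example.com:1521/ORCL \
      --username etl_user -P \
      --table SALES.ORDER_SUMMARY \
      --export-dir /user/etl/order_summary \
      --input-fields-terminated-by '\t'

    # External Hive table over the imported files; the query runs as MapReduce.
    hive -e "
    CREATE EXTERNAL TABLE IF NOT EXISTS orders (
      order_id BIGINT, customer_id BIGINT, amount DOUBLE)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    LOCATION '/user/etl/orders';
    SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id;"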
Confidential
Java Developer
Responsibilities:
- Competency in using XML web services over SOAP to transfer data to supply chain and domain-expertise monitoring systems.
- Used Maven as the build tool for building JAR files. Used the Hibernate framework (ORM) to interact with the database.
- Knowledge of the Struts Tiles framework for layout management. Worked on the design, analysis, development, and testing phases of the application.
- Developed named HQL queries and Criteria for use in the application. Developed the user interface using JSP and HTML.
- Used JDBC for database connectivity. Involved in projects utilizing Java and Java EE web applications in the creation of fully integrated client management systems.
- Consistently met deadlines as well as requirements for all production work orders.
- Executed SQL statements for searching contractors based on criteria. Developed and integrated the application using the Eclipse IDE.
- Developed JUnit tests for server-side code.
- Involved in building, testing, and debugging JSP pages in the system. Involved in a multi-tiered J2EE design utilizing Spring (IoC) architecture and Hibernate.
- Involved in the development of front-end screens using technologies such as JSP, HTML, AJAX, and JavaScript.
- Configured Spring-managed beans. Used the Spring Security API to configure security.
- Investigated, debugged, and fixed potential bugs in the implementation code.
Environment: Java, J2EE, JSP, Hibernate, Struts, XML Schema, SOAP, JavaScript, PL/SQL, JUnit, AJAX, HQL, HTML, JDBC, Maven, Eclipse
Confidential
Junior Java Developer
Responsibilities:
- Involved in various phases of Software Development Life Cycle (SDLC), such as requirements gathering, modelling, analysis, design and development.
- Ensured a clear understanding of the customer's requirements before developing the final proposal.
- Generated use case diagrams, activity flow diagrams, class diagrams, and object diagrams in the design phase.
- Used Java design patterns such as DAO and Singleton.
- Wrote complex SQL queries for retrieving and updating data.
- Involved in implementing a multithreaded environment to generate messages.
- Used JDBC Connections and WebSphere Connection pool for database access.
- Used Struts tag libraries (html, logic, tab, bean, etc.) and JSTL tags in the JSP pages.
- Involved in development using Struts components: struts-config.xml, Tiles, form beans, and plug-ins in the Struts architecture.
- Configured connection pooling using the WebLogic application server.
- Developed and deployed the application on WebLogic using an Ant build.xml script.
- Developed SQL queries and stored procedures to execute the backend processes using Oracle.
- Deployed the application on WebLogic Application Server; development was done using Eclipse.
Environment: Java 1.4, Servlets, JSP, JMS, Struts, Validation Framework, Tag Libraries, JSTL, JDBC, PL/SQL, HTML, JavaScript, Oracle 9i (SQL), UNIX, Eclipse 3.0, LINUX, CV