Hadoop Developer Resume
Nashville, TN
SUMMARY
- 8+ years of overall experience in the IT industry, including Java, Big Data technologies and web applications in multi-tiered environments using Java, Hadoop, Hive, HBase, Pig, Sqoop, J2EE (Spring, JSP, Servlets), JDBC, HTML and JavaScript (AngularJS).
- 4 years of comprehensive experience in Big Data Analytics.
- Working knowledge of the AWS environment and Spark on AWS.
- Strong experience with cloud computing platforms such as AWS.
- Extensive experience in Hadoop architecture and its components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts.
- Well versed in installing, configuring, supporting and managing Big Data and the underlying Hadoop cluster infrastructure, including CDH3 and CDH4 clusters.
- Designed and implemented a Cassandra-based database and related web service for storing unstructured data.
- Experience with NoSQL databases including HBase and Cassandra.
- Designed and implemented a Cassandra NoSQL based database and associated RESTful web service that persists high-volume user profile data for vertical teams.
- Experience in building large-scale, highly available web applications. Working knowledge of web services and other integration patterns.
- Experience in managing and reviewing Hadoop log files.
- Experience in using Pig, Hive, Sqoop and Cloudera Manager.
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
- Hands-on experience in RDBMS and Linux shell scripting.
- Extended Hive and Pig core functionality by writing custom UDFs.
- Experience in analyzing data using HiveQL, Pig Latin and MapReduce.
- Developed MapReduce jobs to automate transfer of data from HBase.
- Knowledge of job workflow scheduling and monitoring tools like Oozie and ZooKeeper.
- Knowledge of data warehousing and ETL tools like Informatica and Pentaho.
- Experienced in Oracle Database Design and ETL with Informatica.
- Mentored, coached and cross-trained junior developers by providing domain knowledge and design advice.
- Proven ability in defining goals, coordinating teams and achieving results.
- Experience with procedures, functions, packages, views, materialized views, function-based indexes, triggers, dynamic SQL and ad-hoc reporting using SQL.
- Experience with Business Intelligence (DW) applications.
- Hands-on work with ETL processes.
- Knowledge of job workflow scheduling and monitoring tools like Oozie and ZooKeeper; of NoSQL databases such as HBase and Cassandra; and of administrative tasks such as installing Hadoop and its ecosystem components (Flume, Oozie, Hive and Pig) and commissioning and decommissioning nodes.
- Extensive experience in using MVC architecture, Struts, Hibernate for developing web applications using Java, JSPs, JavaScript, HTML, jQuery, AJAX, XML and JSON.
- Excellent Java development skills using J2EE, Spring, J2SE, Servlets, JUnit, MRUnit, JSP and JDBC.
- Techno-functional responsibilities include interfacing with users, identifying functional and technical gaps, estimates, designing custom solutions, development, leading developers, producing documentation, and production support.
- Excellent interpersonal and communication skills, creative, research-minded, technically competent and result-oriented with problem solving and leadership skills.
- Practical understanding of data modeling concepts such as Star Schema modeling, Snowflake Schema modeling, and fact and dimension tables.
- Collaborate with data architects on data model management and version control, conduct data model reviews with project team members, and create data objects (DDL).
- Collaborate with BI teams to create reporting data structures.
TECHNICAL SKILLS
Programming Languages: C, C++, Java, Shell Scripting, PL/SQL
J2EE Technologies: Spring, Servlets, JSP, JDBC, Hibernate.
Big Data Ecosystem: HDFS, HBase, MapReduce, Hive, Pig, Sqoop, Impala, Cassandra, Oozie, ZooKeeper, Flume.
DBMS: Oracle 11g, SQL Server, MySQL.
Modeling Tools: UML on Rational Rose 4.0
Web Technologies: HTML, JavaScript, XML, jQuery, Ajax, CSS
Web Services: WebLogic, WebSphere, Apache Cassandra, Tomcat
IDEs: Eclipse, NetBeans, WinSCP.
Operating systems: Windows, UNIX, Linux (Ubuntu), Solaris, CentOS.
Version and Source Control: CVS, SVN.
Servers: Apache Tomcat.
Frameworks: MVC, Struts, Log4J, JUnit, Maven, ANT, Web Services.
PROFESSIONAL EXPERIENCE
Confidential - Nashville, TN
Hadoop Developer
Responsibilities:
- Develop, automate and maintain scalable cloud infrastructure to help process terabytes of data.
- Resolve most Hive issues and introduce tools to optimize query performance.
- Work with the data science team and software engineers to automate and scale their work.
- Automate/build scalable infrastructure in AWS.
- Designed and implemented Hive and Pig UDFs for evaluating, filtering, loading and storing data.
- Created internal and external Hive tables as required, defined with appropriate static and dynamic partitions for efficiency.
- Loaded and transformed large sets of structured and semi-structured data using Hive and Impala.
- Connected Hive and Impala to the Tableau reporting tool and generated graphical reports.
- Configured and tested Presto ODBC/JDBC connections, and documented and demonstrated their efficiency in querying data for business use.
- Worked on Active Batch Directory to automate the incremental scripts, and identified and resolved many issues.
- Wrote many scripts in Redshift and moved the scope from Hive to Redshift because the data was relational.
- Wrote Lambda functions to stream incoming data from APIs, created the tables in DynamoDB, and then ingested the data into AWS S3.
- Created EMR clusters with the applications Hive, Spark, Hue, Zeppelin-Sandbox, Ganglia and Presto-Sandbox, and scaled up the nodes. Currently launch automated clusters using Python scripts.
- Build the framework for incremental queries using shell scripts, and work with SQL Server and PostgreSQL.
- Participated in multiple Big Data POCs to evaluate different architectures, tools and vendor products.
- Resolved many issues in Hive, Impala and Presto. Dealt with large datasets.
- Analyze large datasets, change the existing workflow for efficiency, and work under Agile methodology.
Confidential - Santa Clara, CA
Hadoop Developer/Administrator
Responsibilities:
- Installed and configured MapReduce, Hive and HDFS; implemented a CDH5 Hadoop cluster on CentOS. Assisted with performance tuning and monitoring.
- Conducted code reviews to ensure systems operations and prepared code modules for staging.
- Served as project manager for this project, contributing to management and estimation activities.
- Ran a Scrum-based agile development group.
- Played a key role in driving high-performance infrastructure strategy, architecture and scalability.
- Utilized high-level information architecture to design modules for complex programs.
- Wrote scripts to automate application deployments and configurations. Monitored YARN applications.
- Implemented HAWQ to render queries faster than any other Hadoop-based query interface.
- Wrote MapReduce programs to clean and pre-process the data coming from different sources.
- Implemented various output formats such as SequenceFile and Parquet in MapReduce programs. Also implemented multiple output formats in the same program to match the use cases.
- Implemented test scripts to support test driven development and continuous integration.
- Converted text files into Avro and then to Parquet format so the files could be used with other Hadoop ecosystem tools.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Exported the analyzed data to HBase using Sqoop to generate reports for the BI team.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Participated in the requirements gathering and analysis phase of the project, documenting the business requirements through workshops and meetings with various business users.
- Worked on external HAWQ tables, loading data directly from CSV files and then into internal tables.
- Responsible for implementation and ongoing administration of Hadoop infrastructure.
- Performance tuning of Hadoop clusters and Hadoop MapReduce routines.
- Manage and review Hadoop log files.
- File system management and monitoring.
- Point of Contact for Vendor escalation.
Environment: MapReduce, HDFS, Hive, Pig, Hue, Oozie, Core Java, Eclipse, HBase, Flume, Cloudera Manager, Oracle 10g, DB2, IDMS, VSAM, SQL*Plus, Toad, PuTTY, Windows NT, UNIX Shell Scripting, Pentaho Big Data, YARN, HAWQ, Spring XD, CDH.
Confidential - Omaha, NE
Hadoop Developer
Responsibilities:
- Developed the application using Struts Framework that leverages classical Model View Controller (MVC) architecture.
- Worked extensively on the user interface for a few modules using JSPs, JavaScript and Ajax.
- Created business logic using Servlets and POJOs and deployed them on a WebLogic server.
- Involved in running Hadoop jobs for processing millions of records of text data.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and log files.
- Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop Clusters.
- Monitored multiple Hadoop cluster environments using Ganglia.
- Managed and scheduled jobs on a Hadoop cluster.
- Involved in defining job flows, managing and reviewing log files.
- Monitored workload, job performance and capacity planning using Cloudera Manager.
- Installed the Oozie workflow engine to run multiple MapReduce, Hive and Pig jobs.
- Implemented MapReduce programs to transform log data into a structured form for finding user information.
- Responsible for loading and transforming large sets of structured, semi-structured and unstructured data.
- Collected the log data from web servers and integrated into HDFS using Flume.
- Responsible for managing data coming from different sources.
- Extracted files from CouchDB, placed them into HDFS using Sqoop, and pre-processed the data for analysis.
- Gained experience with NoSQL databases.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
Environment: JDK 1.5, J2EE 1.4, Struts 1.3, JSP, Spring, Servlets 2.5, WebSphere 6.1, HTML, XML, JavaScript, Hadoop, AWS, HDFS, Pig, Hive, MapReduce, HBase, Sqoop, Oozie, Ganglia and Flume.
Confidential - Alpharetta, GA
Java/J2EE developer
Responsibilities:
- Designed and developed a Struts-like MVC 2 web framework using the front-controller design pattern, which is used successfully in a number of production systems.
- Spearheaded the "Quick Wins" project by working very closely with the business and end users to improve the current website's ranking from 23rd to 6th in just 3 months.
- Normalized Oracle database, conforming to design concepts and best practices.
- Resolved product complications at customer sites and funneled the insights to the development and deployment teams to adopt a long-term product development strategy with minimal roadblocks.
- Persuaded business users and analysts to adopt alternative solutions that were more robust and simpler to implement from a technical perspective while satisfying the functional requirements from the business perspective.
- Applied design patterns and OO design concepts to improve the existing Java/JEE based code base.
- Identified and fixed transactional issues due to incorrect exception handling and concurrency issues due to unsynchronized blocks of code.
Environment: Java 1.2/1.3, Swing, Applet, Servlet, JSP, custom tags, JNDI, JDBC, XML, XSL, DTD, HTML, CSS, JavaScript, Oracle, DB2, PL/SQL, WebLogic, JUnit, Log4J and CVS.
Confidential - Seattle, WA
Java/J2EE Developer
Responsibilities:
- Involved in various phases of the Software Development Life Cycle (SDLC) such as design, development and unit testing.
- Developed and deployed UI layer logic of sites using JSP, XML, JavaScript, HTML/DHTML, and Ajax.
- Used CSS and JavaScript to build rich internet pages.
- Followed Agile Scrum methodology for the development process.
- Prepared design specifications for application development, covering front end and back end, using design patterns.
- Developed prototype test screens in HTML and JavaScript.
- Involved in developing JSPs for client data presentation and client-side data validation within the forms.
- Developed the application by using the Spring MVC framework.
- Used the Collections framework to transfer objects between the different layers of the application.
- Developed data mapping to create a communication bridge between various application interfaces using XML and XSL.
- Used Spring IoC to inject the parameter values for the dynamic parameters.
- Developed a JUnit testing framework for unit-level testing.
- Actively involved in code review and bug fixing for improving the performance.
- Documented application for its functionality and its enhanced features.
- Created connection through JDBC and used JDBC statements to call stored procedures.
Environment: Spring MVC, Oracle 11g, J2EE, Java, JDBC, Servlets, JSP, XML, Design Patterns, CSS, HTML, JavaScript 1.2, JUnit, Apache Tomcat, My SQL Server 2008.
Confidential
System Analyst
Responsibilities:
- Involved in software development life cycle (SDLC) of the project (analysis, design, development, testing and implementation).
- Used MS-Visio for analysis & design flow and monitored the initial applications prototype development for the project.
- Led a team of 4 developers and served as point of contact for onshore/offshore communication.
- Designed and developed web pages using JSP, JDBC, Servlets, ASP and ASP.NET.
- Developed modules using Core Java, C#, VB.NET and VB6.0.
- Used HTML, CSS, XML and JavaScript to design pages.
- Successfully migrated a legacy application written in VB6.0 to VB.NET.
- Developed web services to retrieve data from the external system as .txt files for loading into the database.
- Developed DTS/SSIS packages to load employee details into row Mony tables of the SQL server for further processing.
- Wrote Stored Procedures, functions and complex SQL queries for database operations.
- Used JavaScript to perform client-side validations.
- Worked on performance tuning of queries.
- Developed reports using Crystal Reports reporting tool.
- Used DataGrid and GridView controls to display data in a customized format in ASP.NET web pages.
- Used LDAP and Active Directory Service Interfaces (ADSI) to authenticate and authorize users.
- Involved in unit testing and production support of the application.
- Managed defects through Remedy.
Environment: Java, J2EE, JSP, Servlets, .NET Framework 2.0, ASP.NET, C#, VB.NET, ADO.NET, Oracle 9i, SQL Server 2005, T-SQL/PL-SQL, HTML, XML, Web Services, JavaScript, Windows 2000, IIS, Tomcat, Visual SourceSafe (VSS), Remedy and Crystal Reports.
Confidential
Java Developer & Support
Responsibilities:
- Provided L3 application support as primary on-call.
- Involved in the development of the Report Generation module, which includes the volume statistics report, Sanctions Monitoring Metrics report, and TPS report.
- Implemented the Online List Management (OLM) and FMM modules using Spring and Hibernate.
- Wrote various SQL, PL/SQL queries and stored procedures for data retrieval.
- Created configuration files for the application using the Spring Framework.
Environment: Core Java, J2EE, JSP, Servlets, jQuery, JavaScript, CSS, HTML, SQL.