Hadoop Developer Resume
Los Angeles, CA
SUMMARY
- IT professional with 8+ years of experience in analysis, design, development, integration, testing, and maintenance of various applications using Java/J2EE technologies, along with 4+ years of Big Data/Hadoop experience.
- Expertise in Big Data architecture with the Hadoop Distributed File System and its ecosystem tools: MapReduce, HBase, Hive, Pig, Zookeeper, Oozie, Flume, Avro, Impala, and Apache Spark.
- Experienced in building highly scalable Big Data solutions using Hadoop across multiple distributions (Cloudera, Hortonworks) and NoSQL platforms (HBase & Cassandra).
- Solid understanding of Hadoop MRv1 and MRv2 (YARN) architectures.
- Experience in writing MapReduce programs and using the Apache Hadoop API for analyzing data.
- Strong experience in developing, debugging, and tuning MapReduce jobs in the Hadoop environment.
- Expertise in developing Pig and Hive scripts for data analysis.
- Hands-on experience in data mining, implementing complex business logic, optimizing queries in HiveQL, and controlling data distribution through partitioning and bucketing techniques to enhance performance.
- Experience working with Hive data and extending the Hive library with custom UDFs to query data in non-standard formats (a minimal UDF sketch appears at the end of this summary).
- Experience in performance tuning of MapReduce jobs, Pig jobs, and Hive queries.
- Involved in the ingestion of data from various databases such as Teradata (sales data warehouse), Oracle, DB2, and SQL Server using Sqoop.
- Used compression techniques (Snappy) with appropriate file formats to make efficient use of storage in HDFS.
- Working knowledge of HDFS admin shell commands.
- Developed core modules in large cross-platform applications using Java, J2EE, Hibernate, Python, Spring, JSP, Servlets, EJB, JDBC, JavaScript, XML, and HTML.
- Experienced with the build tools Maven and Ant and with continuous integration tools like Jenkins.
- Working knowledge of configuring and using monitoring tools like Ganglia and Nagios.
- Hands-on experience with relational databases like Oracle, MySQL, PostgreSQL, and MS SQL Server.
- Extensive experience in developing and deploying applications using WebLogic, Apache Tomcat, and JBoss.
- Developed unit test cases using the JUnit, EasyMock, and MRUnit testing frameworks.
- Experienced with version control systems like SVN and ClearCase.
- Experience using IDEs such as Eclipse 3.0, MyEclipse, RAD, and NetBeans.
- Expertise in Waterfall and Agile software development models and in project planning using Microsoft Project Planner and JIRA.
- Highly motivated, dynamic self-starter with a keen interest in emerging technologies.
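A minimal sketch of the kind of custom Hive UDF referenced above, assuming the classic org.apache.hadoop.hive.ql.exec.UDF API; the package, class, and field semantics are illustrative assumptions rather than details of any actual project.

    package com.example.hive.udf; // hypothetical package name

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Normalizes a free-form text field: trims whitespace, upper-cases the value,
    // and maps null or empty input to "UNKNOWN" so downstream queries stay simple.
    public final class NormalizeField extends UDF {
        public Text evaluate(final Text input) {
            if (input == null || input.toString().trim().isEmpty()) {
                return new Text("UNKNOWN");
            }
            return new Text(input.toString().trim().toUpperCase());
        }
    }

Such a function would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being called from HiveQL.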
TECHNICAL SKILLS
Big Data Technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Hadoop Streaming, Zookeeper, Apache Spark
Hadoop Distributions: Cloudera (CDH4/CDH5), Hortonworks, MapR
Languages: C, C++, Java, SQL, PL/SQL, Pig Latin, HQL
IDE Tools: Eclipse, NetBeans
Frameworks: Hibernate, Spring, Struts, JUnit
Web Technologies: HTML5, CSS3, JavaScript, jQuery, AJAX, Servlets, JSP, JSON, XML, XHTML
Operating Systems: Windows (XP, 7, 8), UNIX, Linux, Ubuntu, CentOS, Mac OS
Application Servers: JBoss, Tomcat, WebLogic, WebSphere
Databases: Oracle, MySQL, DB2, Derby, PostgreSQL, NoSQL databases (HBase, Cassandra)
PROFESSIONAL EXPERIENCE
Confidential - Los Angeles, CA
Hadoop Developer
Responsibilities:
- Extracted and updated data in HDFS using the Sqoop import and export command-line utilities.
- Responsible for developing a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
- Involved in developing Hive UDFs for the needed functionality.
- Involved in creating Hive tables, loading them with data, and writing Hive queries.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Developed the UI for customer service modules and reports using JSF, JSPs, and MyFaces components; used Log4j to log the running system's application events and to trace errors and certain automated routine functions.
- Used Pig for transformations, event joins, filtering bot traffic, and some pre-aggregations before storing the data in HDFS.
- Participated in daily scrum meetings and iterative development.
- Loaded all the data from the existing DWH tables (SQL Server) into HDFS using Sqoop.
- Loaded cache data into HBase using Sqoop (an illustrative HBase client sketch follows this list).
- Worked on the planning, deployment, and support services needed to deliver better service-level agreements and greater end-user satisfaction.
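The cache-data load above was performed with Sqoop; purely as an illustrative sketch (not the actual implementation), the same kind of HBase table could also be populated through the HBase 1.x Java client API. The table name, column family, and row-key format below are assumptions.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    // Writes a single row into a hypothetical "cache_data" table with a "d" column family.
    public class HBaseCacheLoader {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();       // reads hbase-site.xml from the classpath
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("cache_data"))) {
                Put put = new Put(Bytes.toBytes("customer#12345")); // assumed row-key format
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("status"), Bytes.toBytes("ACTIVE"));
                table.put(put);
            }
        }
    }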
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Sqoop, Flume, Oozie, Java, Linux, Maven, SQL Server, Zookeeper, HBase, Cassandra, Apache Spark.
Confidential - Dallas, TX
Java/Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Wrote multiple MapReduce programs in Java for data analysis (a minimal MapReduce sketch follows this list).
- Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files.
- Migrated HiveQL queries to Impala to minimize query response time.
- Migrated data from relational databases to Hadoop HDFS.
- Responsible for creating Hive tables, loading the structured data produced by MapReduce jobs into those tables, and writing Hive queries to further analyze the logs and identify issues and behavioral patterns.
- Worked on Sequence files, RC files, map-side joins, bucketing, and partitioning for Hive performance enhancement and storage improvement.
- Performed extensive data mining using Hive.
- Used Impala to create and manage Parquet tables.
- Responsible for performing extensive data validation using Hive.
- Created Sqoop jobs and Pig and Hive scripts for data ingestion from relational databases and comparison with historical data.
- Created HBase tables to load large sets of structured, semi-structured, and unstructured data coming from UNIX, NoSQL, and a variety of portfolios.
- Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources.
- Wrote Hive generic UDFs to implement business logic.
- Set up a Hadoop cluster on Amazon EC2 using Apache Whirr for a POC.
- Simplified Hadoop management by using Ambari to provision, manage, and monitor Apache Hadoop clusters, providing an intuitive, easy-to-use management web UI backed by Ambari's RESTful APIs.
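A minimal sketch of the kind of Java MapReduce program mentioned in the first bullet of this role; the assumed input format (status code as the third whitespace-separated field of each log line) and the class names are illustrative, not taken from the actual jobs.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Counts log records per status code; assumes the status code is the third
    // whitespace-separated field of each log line.
    public class StatusCodeCount {

        public static class StatusMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text statusCode = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\\s+");
                if (fields.length > 2) {
                    statusCode.set(fields[2]);
                    context.write(statusCode, ONE);
                }
            }
        }

        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "status-code-count");
            job.setJarByClass(StatusCodeCount.class);
            job.setMapperClass(StatusMapper.class);
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }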
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Sqoop, Flume, Oozie, Java, Linux, Maven, Teradata, Zookeeper, SVN, HBase, Cassandra.
Confidential - Foster City, CA
Java/Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Wrote multiple MapReduce programs in Java for data analysis.
- Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files.
- Migrated HiveQL queries to Impala to minimize query response time.
- Migrated data from relational databases to Hadoop HDFS.
- Responsible for creating Hive tables, loading the structured data produced by MapReduce jobs into those tables, and writing Hive queries to further analyze the logs and identify issues and behavioral patterns.
- Worked on Sequence files, RC files, map-side joins, bucketing, and partitioning for Hive performance enhancement and storage improvement.
- Performed extensive data mining using Hive.
- Used Impala to create and manage Parquet tables.
- Responsible for performing extensive data validation using Hive.
- Created Sqoop jobs and Pig and Hive scripts for data ingestion from relational databases and comparison with historical data.
- Created HBase tables to load large sets of structured, semi-structured, and unstructured data coming from UNIX, NoSQL, and a variety of portfolios.
- Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources (a minimal Pig UDF sketch follows this list).
- Wrote Hive generic UDFs to implement business logic.
- Set up a Hadoop cluster on Amazon EC2 using Apache Whirr for a POC.
- Simplified Hadoop management by using Ambari to provision, manage, and monitor Apache Hadoop clusters, providing an intuitive, easy-to-use management web UI backed by Ambari's RESTful APIs.
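A minimal sketch of the kind of Pig UDF written in Java for this role, assuming Pig's EvalFunc API; the class name and the cleanup logic are illustrative assumptions.

    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Trims and upper-cases a chararray field; returns null for null or empty
    // input so a downstream FILTER statement can drop unusable rows.
    public class CleanField extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            String value = input.get(0).toString().trim();
            return value.isEmpty() ? null : value.toUpperCase();
        }
    }

In a Pig Latin script the jar would be registered with REGISTER, and the function bound with DEFINE, before being applied in a FOREACH ... GENERATE statement.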
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Sqoop, Flume, Oozie, Java, Linux, Maven, Teradata, Zookeeper, SVN, HBase, Cassandra.
Confidential - NYC
Linux Admin/Java developer
Responsibilities:
- Worked with business analysts in understanding business requirements and in the design and development of the project.
- Implemented the Struts framework with MVC architecture.
- Developed the presentation layer using JSP, HTML, and CSS, and performed client-side validation using JavaScript.
- Collaborated with the ETL/Informatica team to determine the necessary data models and UI designs to support Cognos reports.
- Used TeX for many typesetting tasks, especially in the form of LaTeX, ConTeXt, and other template packages.
- Performed several data quality checks, found potential issues, and designed Ab Initio graphs to resolve them.
- Applied J2EE design patterns like Business Delegate, DAO and Singleton.
- Deployed and tested the application using Tomcat web server.
- Performed client-side validation using JavaScript.
- Involved in developing DAOs using JDBC (a minimal DAO sketch follows this list).
- Involved in coding, code reviews, and JUnit testing; prepared and executed unit test cases.
- Wrote SQL queries to fetch business data from the Oracle database.
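A minimal sketch of the kind of JDBC-based DAO developed for this role; the table, column, and class names are illustrative assumptions, and the DataSource is assumed to be provided by the container (for example via JNDI in Tomcat).

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import javax.sql.DataSource;

    // Data access object that looks up a customer name by id from an assumed CUSTOMER table.
    public class CustomerDao {
        private final DataSource dataSource;

        public CustomerDao(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        public String findCustomerName(long customerId) throws SQLException {
            String sql = "SELECT NAME FROM CUSTOMER WHERE CUSTOMER_ID = ?";
            try (Connection conn = dataSource.getConnection();
                 PreparedStatement stmt = conn.prepareStatement(sql)) {
                stmt.setLong(1, customerId);
                try (ResultSet rs = stmt.executeQuery()) {
                    return rs.next() ? rs.getString("NAME") : null;
                }
            }
        }
    }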
Environment: Java, JSP, JavaScript, Servlets, Struts, Hibernate, EJB, JSF, Ant, Tomcat, CVS, Eclipse, SQL Developer, Oracle.
Confidential
Linux Admin
Responsibilities:
- Responsible for handling tickets raised by end users, including package installation, login issues, and access issues.
- Performed user management tasks such as adding, modifying, deleting, and grouping users.
- Responsible for preventive maintenance of the servers on a monthly basis and for configuring RAID on the servers.
- Managed resources using disk quotas.
- Documented issues daily in the resolution portal.
- Responsible for change-management releases scheduled by service providers.
- Generated weekly and monthly reports for the tickets worked on and sent them to management.
- Managed systems operations with final accountability for smooth installation, networking, operation, and troubleshooting of hardware and software in a Linux environment.
- Identified the operational needs of various departments and developed customized software to enhance system productivity.
- Ran a Linux Squid proxy server with access restrictions enforced through ACLs and passwords.
- Established and implemented firewall rules and validated them with vulnerability scanning tools.
- Proactively detected computer security violations, collected evidence, and presented results to management.
- Implemented system and e-mail authentication using an enterprise LDAP database.
- Implemented a database-enabled intranet website using Linux, Apache, and a MySQL backend.
- Installed CentOS on multiple servers using PXE (Preboot Execution Environment) boot and the Kickstart method.
- Monitored system metrics and logs for problems.
- Ran crontab jobs to back up data.
- Applied operating system updates, patches, and configuration changes.
- Maintained the MySQL server and managed database authentication for required users.
- Appropriately documented various administrative and technical issues.