Hadoop Developer Resume
Chicago, IL
SUMMARY
- A Hadoop and Java certified professional with over 6 years of IT experience, including 3+ years in Big Data and Hadoop ecosystem technologies, with domain experience in financial, banking, healthcare, retail, and non-profit organizations in software development and application support.
- In-depth knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts.
- Expertise in writing Hadoop jobs for analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
- Created Hive tables to store data in HDFS and processed the data using HiveQL.
- Excellent knowledge of the Hadoop ecosystem, including HDFS, MapReduce, Hive, Pig, Spark, YARN, HBase, Oozie, Flume, and Sqoop-based big data platforms.
- Expertise in design and implementation of Big Data solutions in Banking, Retail and E-commerce domains.
- Experienced with NoSQL databases such as HBase and Cassandra.
- Excellent ability to use analytical tools to mine data and evaluate the underlying patterns.
- Good knowledge of Linux shell scripting and shell commands.
- Assisted in Cluster maintenance, Cluster Monitoring, Managing and Reviewing data backups and log files.
- Experience in Big Data platforms like Hortonworks, Cloudera, Amazon EC2 and Apache.
- Experience in cluster administration of Hadoop 2.2.0.
- Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources.
- Analyzed data by performing Hive queries and used Hive UDFs for complex querying.
- Expertise in Hive Query Language (HiveQL), Hive Security and debugging Hive issues.
- Experience in writing real time query processing using Cloudera Impala.
- Experience in Administering and Installation of Hadoop clusters using Cloudera Manager.
- Expertise in optimizing traffic across network using Combiners, joining multiple schema datasets using Joins and organizing data using Partitions and Buckets.
- Hands-on experience with ETL tools such as Informatica and Talend.
- Expert in Java technologies, designing and developing applications built on Struts, Spring Core, Spring, Servlets, JSP, JavaScript, XML, J2EE, and JDBC.
- Good experience with UML modeling tools; developed UML diagrams using MyEclipse.
TECHNICAL SKILLS
Languages and Hadoop Components: Hadoop, Sqoop, Flume, Hive, Pig, MapReduce, YARN, Oozie, Kafka, Spark, Impala, Storm, Hue, Java, SQL.
SQL and NoSQL Databases: HBase, Cassandra, Oracle, MySQL
Big Data Platforms: Hortonworks, Cloudera, Amazon AWS, Apache
Web Technologies: JSP, HTML, CSS, Apache Tomcat, lighttpd (lighty)
Operating Systems: Linux, UNIX, Windows 98/XP/Vista/7/8/10, iOS, macOS
Methodologies: Agile, Rapid Application Development, Waterfall Model, Iterative Model
Frameworks: Hibernate, EJB, Struts, Spring
PROFESSIONAL EXPERIENCE
Confidential, Durham, NC
Java & Hadoop Developer
Environment: Apache Hadoop 2.2.0, Cloudera, MapReduce, Eclipse, Hive, HDFS, Impala, Java 1.7, UNIX, Shell Scripting, JavaScript, HTML5, CSS, JDBC to Hive/Impala
RESPONSIBILITIES:
- Developed multiple programs for the validate-and-load procedure of three different file segments.
- Data validation: wrote a Java program to scan each file and verify that its data was valid.
- Developed code to convert the required Excel files to text and push them to HDFS.
- Used cron jobs to automate this process.
- Ran the validate and load procedures from UNIX by passing the code as arguments to the Hadoop jar command.
- Created Impala tables to hold the data from these files.
- Used the JDBC Hive/Impala connector to insert data into the tables (see the sketch after this list).
- Designed and developed a web app/GUI using JavaScript, HTML5, and CSS with the Struts framework.
- Deployed the web app on an Apache Tomcat server.
- Maintained the cluster for a temporary period.
- Used a JDBC connection to query data from the Impala tables.
- Wrote queries to read data from the tables and append it to the corresponding report tables in Impala.
- Made reports available for download.
- Worked on the detailed design of the project.
- Delivered tasks efficiently and on time.
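A minimal sketch of the JDBC-to-Impala loading step described above. The host, port, database, and table names are hypothetical placeholders; the real job used the project's own connection settings and SQL.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ImpalaLoader {
    // Hypothetical connection string; the actual host, port, and database
    // belonged to the project's Cloudera environment.
    private static final String JDBC_URL = "jdbc:hive2://impala-host:21050/reports;auth=noSasl";

    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(JDBC_URL);
             Statement stmt = conn.createStatement()) {
            // Illustrative insert-select; segment_report and validated_segment are placeholder tables.
            stmt.execute("INSERT INTO segment_report SELECT * FROM validated_segment");
        }
    }
}
```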
Confidential, Trenton, NJ
Hadoop Developer
Environment: Apache Hadoop 2.2.0, Cloudera, MapReduce, Hive, HBase, HDFS, Pig, Sqoop, Impala, Spark, Oozie, Java 1.7, UNIX, Shell Scripting, XML.
RESPONSIBILITIES:
- Imported data from relational data stores into Hadoop using Sqoop.
- Created various Hive and Pig Latin scripts for performing ETL transformations on transactional and application-specific data sources.
- Wrote and executed Pig scripts using the Grunt shell.
- Performed big data analysis using Pig and user-defined functions (UDFs).
- Performed joins, group-by, and other operations in Hive and Pig.
- Processed and formatted the output from Pig and Hive before writing it to the Hadoop output files.
- Used Hive table definitions to map the output files to tables.
- Wrote MapReduce and HBase jobs.
- Reviewed the HDFS usage and system design for future scalability and fault-tolerance.
- Used Impala for real time query processing in Cloudera.
- Worked with the HBase NoSQL database.
- Worked with Apache Spark for quick analytics on object relationships.
- Created UDFs to encrypt customer-sensitive data, stored it in HDFS, and performed analysis using Pig (see the sketch after this list).
- Worked effectively with the team on big data tasks and delivered projects on time.
- Involved in cluster setup meetings with the administration team.
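A minimal sketch of the kind of Pig UDF described above for masking sensitive fields before they land in HDFS. A SHA-256 digest stands in here for whatever encryption routine the project actually used, and the class name is illustrative.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Pig UDF sketch: masks a sensitive field before it is stored in HDFS.
// A SHA-256 digest is used here as a stand-in for the project's encryption.
public class MaskField extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(input.get(0).toString().getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (Exception e) {
            throw new IOException("Failed to mask field", e);
        }
    }
}
```

In a Pig script, a function like this would be registered with REGISTER and applied inside a FOREACH ... GENERATE statement.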
Confidential, Atlanta, GA
Hadoop Developer
Environment: Hadoop 2.2.0, MapReduce, MongoDB, YARN, Hive, Pig, HBase, Oozie, Sqoop, Flume, Core Java, Hortonworks, HDFS, Eclipse.
RESPONSIBILITIES:
- Worked on Hortonworks platform.
- Developed data pipeline using Flume and Sqoop to ingest customer behavioral data and financial histories from traditional databases into HDFS for analysis.
- Involved in writing MapReduce jobs.
- Ingested data using Sqoop and HDFS put/copyFromLocal operations.
- Used Pig for transformations, event joins, filtering bot traffic, and some pre-aggregations before storing the data in HDFS.
- Involved in developing Pig UDFs for functionality not available out of the box in Apache Pig.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Involved in developing Hive DDLs to create, alter and drop Hive tables.
- Involved in developing Hive UDFs for the needed functionality that is not available from Apache Hive.
- Computed metrics that define user experience, revenue, etc., using Java MapReduce (see the sketch after this list).
- Responsible for developing a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
- Used Sqoop for importing and exporting data into HDFS.
- Involved in processing ingested raw data using MapReduce, Apache Pig, and Hive.
- Involved in emitting processed data from Hadoop to relational databases or external file systems using Sqoop and HDFS get/copyToLocal.
- Involved in developing shell scripts to orchestrate execution of the other scripts (Pig, Hive, and MapReduce) and to move data files into and out of HDFS.
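A minimal sketch of the kind of Java MapReduce metric job mentioned above, assuming tab-delimited weblog lines with the user id and a revenue amount in fixed positions. The field layout and class names are assumptions, not the project's actual schema.

```java
import java.io.IOException;

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Sketch of a metric job: sums revenue per user from delimited log lines.
public class RevenuePerUser {

    public static class RevenueMapper extends Mapper<LongWritable, Text, Text, DoubleWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t");
            // Assumed layout: user id in field 0, revenue amount in field 5.
            if (fields.length > 5) {
                try {
                    context.write(new Text(fields[0]),
                                  new DoubleWritable(Double.parseDouble(fields[5])));
                } catch (NumberFormatException ignored) {
                    // Skip malformed lines rather than failing the job.
                }
            }
        }
    }

    public static class RevenueReducer extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
        @Override
        protected void reduce(Text key, Iterable<DoubleWritable> values, Context context)
                throws IOException, InterruptedException {
            double total = 0;
            for (DoubleWritable v : values) {
                total += v.get();
            }
            context.write(key, new DoubleWritable(total));
        }
    }
}
```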
Confidential, Chicago, IL
Hadoop Developer
Environment: HBase, Oozie, Pig, Flume, Sqoop, Cloudera CDH4, AWS EC2 cloud, Hadoop HDFS, MapReduce, Hive.
Responsibilities:
- Responsible for coding MapReduce programs and Hive queries, and for testing and debugging the MapReduce programs.
- Developed Pig Latin scripts to analyze large data sets in areas where extensive hand-written code could be reduced.
- Used the Sqoop tool to extract data from a relational database into Hadoop.
- Worked closely with data warehouse architect and business intelligence analyst to develop solutions.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke MapReduce jobs in the backend.
- Installed and configured Hadoop cluster in DEV, QA and Production environments.
- Performed upgrade to the existing Hadoop clusters.
- Enabled Kerberos authentication for the Hadoop cluster and integrated it with Active Directory for managing users and application groups.
- Implemented commissioning and decommissioning of nodes in the existing cluster.
- Worked with systems engineering team for planning new Hadoop environment deployments, expansion of existing Hadoop clusters.
- Monitored workload, job performance, and capacity planning using Cloudera Manager.
- Worked with application teams to install OS level updates, patches and version upgrades required for Hadoop cluster environments.
- Supported setting up the QA environment and updating configurations for implementing scripts with Pig, Hive, and Sqoop.
Confidential, New York, NY
Hadoop Developer
Environment: HDFS, Map Reduce, Hive, Sqoop, Pig, Flume, HBase, Oozie Scheduler, Java, Shell Scripts.
Responsibilities:
- Part of the team developing and writing Pig scripts.
- Loaded data from an RDBMS server into Hive using Sqoop.
- Created Hive tables to store the processed results in a tabular format.
- Developed Sqoop scripts to handle the data movement between Hive and the MySQL database.
- Developed Java Mapper and Reducer programs for complex business requirements.
- Developed custom Java record readers, partitioners, and serialization techniques (see the sketch after this list).
- Used different data formats (Text format and Avro format) while loading the data into HDFS.
- Created Managed tables and External tables in Hive and loaded data from HDFS.
- Performed complex HiveQL queries on Hive tables.
- Optimized Hive tables using techniques such as partitioning and bucketing to improve the performance of HiveQL queries.
- Created partitioned tables and loaded data using both static partition and dynamic partition method.
- Created custom user defined functions in Hive.
- Performed Sqoop imports from Oracle to load data into HDFS and directly into Hive tables.
- Developed Pig Scripts to store unstructured data in HDFS.
- Scheduled MapReduce jobs in the production environment using the Oozie scheduler.
- Analyzed Hadoop logs using Pig scripts to track errors introduced by the team.
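A minimal sketch of a custom partitioner of the kind mentioned above. The key format ("region|customerId") is an assumption used only to illustrate routing records to reducers, not the project's actual key layout.

```java
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Custom partitioner sketch: routes records to reducers by a region prefix
// in the key; "region|customerId" is an assumed key format.
public class RegionPartitioner extends Partitioner<Text, Text> {
    @Override
    public int getPartition(Text key, Text value, int numPartitions) {
        String region = key.toString().split("\\|")[0];
        return (region.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}
```

Such a class would be wired into a job with job.setPartitionerClass(RegionPartitioner.class).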
Confidential
Java and SQL Developer
Environment: Java, JDBC, Eclipse, Oracle, SQL, XML
Responsibilities:
- Followed AGILE Methodology with SCRUM Meetings and involved in maintaining Sprint backlogs during the development cycles.
- Involved in interacting with the Business Analyst and Architect during the Sprint Planning Sessions.
- Responsible for designing UML diagrams, such as class and sequence diagrams, during the analysis and design phases of the application.
- Used Spring MVC to handle/intercept the user requests and used various controllers to delegate the request flow to the Backend tier of the application.
- Involved in configuring faces-config.xml.
- Used Spring Core (Inversion of Control)/DI (Dependency Injection) to wire the object dependencies across the application.
- Used Spring Security extensively for authentication and authorization.
- Used Hibernate ORM Framework for Data persistence and transaction management.
- Involved in creating Hibernate POJOs and developing Hibernate mapping files (see the sketch after this list).
- Involved in writing complex HQL and stored procedures to handle persistence operations.
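A minimal sketch of a Hibernate persistent POJO of the kind described above, shown with JPA annotations rather than the XML mapping files used on the project. The entity, table, and column names are illustrative.

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;

// Illustrative Hibernate POJO; table and column names are placeholders.
@Entity
@Table(name = "ACCOUNTS")
public class Account {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    @Column(name = "ACCOUNT_ID")
    private Long id;

    @Column(name = "ACCOUNT_NAME", nullable = false)
    private String name;

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}
```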
Confidential
Java Application Developer
Environment: Java, J2EE, MySQL, Windows, UNIX.
Responsibilities:
- Analyzed all business functionality related to Confidential services.
- Developed technical specifications for various back end modules from business requirements.
- Worked with EJBs, creating new beans and modifying existing ones per requirements.
- Developed back-end interfaces using Business Delegates and Data Access Objects (DAOs) for interacting with Informix (see the sketch after this list).
- Used EJB QL for retrieving data.
- Utilized the JavaMail service to communicate between the GEMA and non-GEMA applications.
- Responsible for SQL tuning and optimization using Analyze, Explain Plan, TKPROF utility and optimizer hints.
- Suggested and converted several existing UIs for better user interaction.
- Developed JSPs as part of the UI layer.
- Involved in unit testing and system testing for new requirements.
- Involved in communication with business stakeholders for clarification of business requirements.
- Developed test cases for business functionalities.
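A minimal sketch of the DAO pattern described above. The table, column, and DataSource wiring are illustrative placeholders, not the actual Informix schema.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import javax.sql.DataSource;

// DAO sketch: encapsulates Informix access behind a simple interface;
// the table and column names here are placeholders.
public class ServiceRequestDao {

    private final DataSource dataSource;

    public ServiceRequestDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public String findStatusById(long requestId) throws SQLException {
        String sql = "SELECT status FROM service_request WHERE request_id = ?";
        try (Connection conn = dataSource.getConnection();
             PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setLong(1, requestId);
            try (ResultSet rs = stmt.executeQuery()) {
                return rs.next() ? rs.getString("status") : null;
            }
        }
    }
}
```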