Hadoop Administrator/Developer Resume
Memphis, TN
SUMMARY
- 8+ years of Information Technology experience, including 4 years in the Hadoop ecosystem.
- Worked with Big Data distributions like Cloudera (CDH 3 and 4) with Cloudera Manager.
- Strong experience in using Hive to process and analyze large volumes of data.
- Expert in using Sqoop to import data from external systems into HDFS for analysis and to export results back to those systems for further processing.
- Expert in creating Pig and Hive UDFs in Java to analyze data efficiently (see the UDF sketch after this list).
- Good experience with Spark SQL and Spark Streaming and a solid understanding of RDDs.
- Experience with Kafka for real-time streaming applications that react to streamed data.
- Used HBase alongside Pig/Hive as needed for real-time, low-latency queries.
- Worked with Impala and Hue, integrating with the shared Hive metastore so table definitions are available across tools.
- Experience with NoSQL databases such as MongoDB and HBase.
- Strong knowledge of Software Development Life Cycle (SDLC).
- Experienced in creating and analyzing Software Requirement Specifications (SRS) and Functional Specification Document (FSD).
- Worked on Windows and UNIX/Linux platforms with technologies such as Big Data, SQL, PL/SQL, XML, HTML, Core Java, and Shell Scripting.
- Good communication and interpersonal skills; committed, result-oriented, and hard-working, with a drive to learn new technologies.
- Experienced in working in multi-cultural environments, both within a team and individually, as project requirements demand.
- Experience creating MapReduce code in Java per business requirements.
- Extensive knowledge of creating PL/SQL stored procedures, packages, functions, and cursors against Oracle (9i, 10g, 11g) and MySQL.
- Strong hands-on technical skills in Core Java.
- Worked with ETL tools such as Talend to simplify MapReduce jobs from the front end; also familiar with Informatica and IBM InfoSphere as Big Data ETL tools.
- Worked with BI tools like Tableau for report creation and further analysis from the front end.
- Extensive knowledge in using SQL queries for backend database analysis.
- Expert in implementing advanced procedures such as text analytics and processing using in-memory computing capabilities such as Spark.
- Involved in developing distributed enterprise and web applications using UML, Java/J2EE, and web technologies including EJB, JSP, Servlets, JMS, JDBC, JPA, HTML, XML, Tomcat, Spring, and Hibernate.
- Expertise in defect management and defect tracking, with performance tuning to deliver a high-quality product.
- Experienced in providing training to team members as required by the project.
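A minimal sketch of the kind of Hive UDF referenced above, written in Java; the class name and the normalization logic are illustrative assumptions, not taken from an actual project:

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF that trims and lower-cases a string column before analysis.
public class NormalizeText extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```

Packaged into a JAR, a function like this would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in HiveQL queries.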
TECHNICAL SKILLS
Hadoop/Big Data Technologies: HDFS, MapReduce, Sqoop, Pig, Hive, Oozie, Impala, Spark, Zookeeper and Cloudera Manager.
NoSQL Database: HBase
Monitoring and Reporting: Tableau, Custom shell scripts
Hadoop Distribution: Hortonworks, Cloudera, MapR
Build Tools: Maven, SQL Developer
Programming & Scripting: Java, C, SQL, Shell Scripting
Java Technologies: Servlets, JavaBeans, JDBC, Spring, Hibernate
Databases: Oracle, MySQL, MS SQL Server, Teradata
Web Dev. Technologies: HTML, XML, JSON, CSS
Version Control: SVN, CVS, GIT
Operating Systems: Linux, Unix, Mac OS-X, Windows 8, Windows 7, Windows Server 2008/2003
PROFESSIONAL EXPERIENCE
Confidential, Memphis, TN
Hadoop Administrator/Developer
Responsibilities:
- Extracted data into HDFS and exported results back out using the Sqoop import and export command-line utilities.
- Responsible for developing a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
- Involved in developing Hive UDFs for the needed functionality.
- Involved in creating Hive tables, loading with data and writing Hive queries.
- Managed work including indexing data, tuning relevance, developing custom tokenizers and filters, and adding functionality such as playlists, custom sorting, and regionalization with the Solr search engine.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Used Pig for transformations, event joins, filtering bot traffic, and some pre-aggregations before storing the data in HDFS.
- Implemented advanced procedures such as text analytics and processing using in-memory computing capabilities such as Spark.
- Enhanced and optimized production Spark code to aggregate, group, and run data-mining tasks using the Spark framework (a sketch of this kind of job appears after this list).
- Extending Hive functionality by writing custom UDFs.
- Experience in managing and reviewing Hadoop log files.
- Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Involved in emitting processed data from Hadoop to relational databases and external file systems using Sqoop.
- Orchestrated hundreds of Sqoop scripts, pig scripts, Hive queries using Oozie workflows and sub-workflows.
- Loaded cache data into HBase using Sqoop.
- Built custom Talend jobs to ingest, enrich, and distribute data in MapR and Cloudera Hadoop ecosystems.
- Created numerous Hive external tables pointing to HBase tables.
- Analyzed HBase data in Hive by creating external partitioned and bucketed tables.
- Worked with cache data stored in Cassandra.
- Ingested data from external and internal flow organizations.
- Used the external tables in Impala for data analysis.
- Supported MapReduce programs running on the cluster.
- Participated in Apache Spark POCs for analyzing sales data based on several business factors.
- Participated in daily scrum meetings and iterative development.
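A minimal sketch of the kind of Spark aggregation job referred to above, using the Java API; the HDFS paths and the comma-separated record layout (product ID, region, amount) are assumptions for illustration:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class SalesAggregation {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("SalesAggregation");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Assumed input layout: "productId,region,amount" lines stored in HDFS.
        JavaRDD<String> lines = sc.textFile("hdfs:///data/sales/input");

        // Group sales amounts by product and sum them.
        JavaPairRDD<String, Double> totals = lines
                .mapToPair(line -> {
                    String[] fields = line.split(",");
                    return new Tuple2<>(fields[0], Double.parseDouble(fields[2]));
                })
                .reduceByKey(Double::sum);

        totals.saveAsTextFile("hdfs:///data/sales/totals");
        sc.stop();
    }
}
```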
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, Impala, Sqoop, Flume, Oozie, Apache Spark, Java, Linux, SQL Server, Zookeeper, Autosys, Tableau, Cassandra.
Confidential, New Jersey
Hadoop Administrator/Developer
Responsibilities:
- Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters with agile methodology.
- Monitored multiple Hadoop clusters environments using Ganglia, monitored workload, job performance and capacity planning using Cloudera Manager.
- Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
- Thorough hands-on experience with Hadoop, Java, SQL, and Python.
- Used Flume to collect, aggregate, and store web log data from different sources such as web servers, mobile, and network devices, and pushed it to HDFS.
- Participated in functional reviews, test specifications and documentation review
- Wrote MapReduce programs to transform raw log data into a structured form and derive user location, age group, and time spent (see the mapper sketch after this list).
- Analyzed the web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration, and the most purchased products on the website.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports by Business Intelligence tools.
- Used Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as MapReduce, Pig, Hive, and Sqoop) as well as system-specific jobs (such as Java programs and shell scripts).
- Proactively monitored systems and services, architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
- Used Talend as an ETL tool; also have extensive knowledge of Netezza.
- Installed and configured Flume, Hive, Pig, Sqoop, and Oozie on the Hadoop cluster; involved in analyzing system failures, identifying root causes, and recommending courses of action.
- Documented system processes and procedures for future reference; responsible for managing data coming from different sources.
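A minimal sketch of the kind of log-parsing MapReduce job described above, counting page views per user location; the tab-delimited layout and the field position of the location are assumptions for illustration:

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;

// Counts page views per user location from tab-delimited web logs.
public class LocationPageViews {

    public static class LocationMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text location = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t");
            if (fields.length > 3) {        // assumption: location is the 4th field
                location.set(fields[3]);
                context.write(location, ONE);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance();
        job.setJarByClass(LocationPageViews.class);
        job.setMapperClass(LocationMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```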
Environment: Hadoop, HDFS, MapReduce, Flume, Pig, Sqoop, Hive, Oozie, Ganglia, HBase, Shell Scripting
Confidential - Jessup, PA
Java Developer
Responsibilities:
- Created the database and the user, environment, activity, and class diagrams for the project (UML).
- Implemented the database using the Oracle database engine.
- Designed and developed a fully functional, generic, n-tiered J2EE application platform in an Oracle-technology-driven environment. The entire infrastructure application was developed using Oracle Developer in conjunction with Oracle ADF-BC and the Oracle ADF rich client components.
- Created an entity object (business rules and policy, validation logic, default value logic, security).
- Created view objects, view links, association objects, application modules with data validation rules (Exposing linked views in an application module), LOV, dropdown, value defaulting, transaction management features.
- Web application development using J2EE, JSP, Servlets, JDBC, JavaBeans, Struts, Ajax, Custom Tags, EJB, Hibernate, Ant, JUnit, Apache Log4j, Web Services, and Message Queue (MQ).
- Designed the GUI prototype using ADF 11g GUI components before finalizing it for development.
- Experience in using version controls such as CVS, PVCS.
- Involved in consuming and producing RESTful web services using JAX-RS (a resource-class sketch follows this list).
- Collaborated with ETL/Informatica team to determine the necessary data modules and UI designs to support Cognos reports.
- Used JUnit for unit testing and integration testing.
- Created modules using bounded and unbounded task flows.
- Generated WSDL (web services) and created workflows using BPEL.
- Created the skin for the layout.
- Performed integration testing for the application.
- Created dynamic reports using JFreeChart.
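A minimal sketch of the kind of JAX-RS resource class mentioned above; the path, payload, and class name are illustrative assumptions:

```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

// Hypothetical REST endpoint exposing report data as JSON.
@Path("/reports")
public class ReportResource {

    @GET
    @Path("/{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public String getReport(@PathParam("id") String id) {
        // A real implementation would look the report up; this returns a stub payload.
        return "{\"id\": \"" + id + "\", \"status\": \"READY\"}";
    }
}
```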
Environment: Java, Servlets, JSF, ADF Rich Client UI Framework, ADF-BC (BC4J) 11g, Web Services using Oracle SOA, Oracle WebLogic.
Confidential
Software Engineer
Responsibilities:
- Developed the user interface screens using Swing to accept various system inputs such as contractual terms and monthly data pertaining to production, inventory, and transportation.
- Involved in designing database connections using JDBC.
- Involved in design and development of UI using HTML, JavaScript and CSS.
- Involved in creating tables and stored procedures in SQL for data manipulation and retrieval using SQL Server 2000, and in database modification using SQL, PL/SQL, triggers, and views in Oracle.
- Used DispatchAction to group related actions into a single class.
- Built the applications using the Ant tool and used Eclipse as the IDE.
- Developed the business components used for the calculation module.
- Involved in the logical and physical database design and implemented it by creating suitable tables, views and triggers.
- Applied J2EE design patterns such as Business Delegate, DAO, and Singleton.
- Created the related stored procedures and functions invoked through JDBC calls for the above requirements (see the JDBC sketch after this list).
- Actively involved in testing, debugging and deployment of the application on WebLogic application server.
- Developed test cases and performed unit testing using JUnit.
- Involved in fixing bugs and minor enhancements for the front-end modules.
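A minimal sketch of the kind of JDBC call to a stored procedure described above; the connection URL, credentials, and procedure name are illustrative assumptions:

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Types;

public class InventoryDao {

    // Calls a hypothetical Oracle stored procedure that returns the
    // on-hand quantity for a product.
    public int getOnHandQuantity(String productId) throws SQLException {
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:oracle:thin:@//dbhost:1521/APP", "app_user", "secret");
             CallableStatement stmt = conn.prepareCall("{call get_on_hand_qty(?, ?)}")) {
            stmt.setString(1, productId);
            stmt.registerOutParameter(2, Types.INTEGER);
            stmt.execute();
            return stmt.getInt(2);
        }
    }
}
```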
Environment: Java, HTML, JavaScript, CSS, Oracle, JDBC, Ant, SQL, Swing, and Eclipse.