
Big Data Developer Resume


San Francisco, CA

SUMMARY:

  • 8 years of total IT experience, including 4+ years as a dedicated, professional Big Data developer with an excellent background in Big Data components such as HDFS, MapReduce, Apache Pig, Hive, Sqoop, HBase, Spark Core and Oozie, and 4 years of experience developing object-oriented, enterprise and web-based applications in Java.
  • In-depth understanding of Data Structures and Algorithms. Experience in Hadoop MapReduce programming, Pig Latin, Hive SQL and HDFS.
  • Hands on experience in writing MapReduce jobs in Java, Pig and Python. Skilled in managing and reviewing Hadoop log files.
  • Experience in different Hadoop distributions such as Cloudera (CDH 5.6) and Hortonworks Data Platform (HDP).
  • Hands-on experience in installing, configuring, supporting and managing Hadoop clusters using Hortonworks and Cloudera.
  • Expert in working with Hive data warehouse tool-creating tables, data distribution by implementing partitioning and bucketing, writing and optimizing the HiveQL queries.
  • Experienced in loading data to hive partitions and creating buckets in Hive. Experience in using Apache Sqoop to import and export data to and from HDFS and Hive.
  • Excellent understanding of NoSQL databases, with hands-on experience in HBase. Experienced in writing Hive scripts for analysts' analysis requirements.
  • Worked on importing and exporting RDBMS data into HDFS and Hive using Sqoop. Developed predictive analytics using the Apache Spark Scala API.
  • Experience in managing, supporting and monitoring Hadoop clusters across various distributions and platforms, including the AWS service console, Hortonworks and Cloudera.
  • Experience with the Oozie workflow engine to run multiple Hive and Pig jobs independently, driven by time and data availability. Developed Scala scripts and UDFs using both DataFrames/SQL and RDD/MapReduce in Spark for data aggregation and queries, and wrote data back into RDBMS through Sqoop.
  • Involved in various phases of Software Development Life Cycle (SDLC) as requirement gathering, modeling, analysis, architecture design & development.
  • Expert in implementation and maintenance of application software in web based enterprise environment, distributed tier architecture.
  • Experience in writing shell scripts (Bash, Perl). Strong Java/JEE application development background with experience in defining technical and functional specifications.
  • Proficient in developing strategies for Extraction, Transformation and Loading (ETL) mechanism and UNIX shell scripting.
  • Experience developing custom tags using JSP, with strong programming skills in Java/J2EE technologies such as JProbe, Java, Spring, Hibernate, JPA, Lucene, AngularJS, XML, JavaScript, Node.js, JSP, JDBC, Struts, Servlets, JAX-WS and JAX-RS.
  • Strong experience in developing predictive analytics using the Apache Spark Scala API. Knowledge of Spark Core, Spark Streaming, DataFrames, Spark SQL, MLlib and GraphX. Extensive experience developing web applications using the Spring MVC and Struts MVC frameworks.
  • Expertise in developing presentation-layer components with HTML, CSS, JavaScript, jQuery, XML, JSON, AJAX and D3.
  • Experienced with source control repositories such as SVN and GitHub. Experienced in detailed system design using use case analysis and functional analysis, modeling programs with class, sequence, activity and state diagrams using UML.
  • Good understanding of NoSQL databases and hands-on experience writing applications on NoSQL databases. Designed and implemented data tables as per the data model.
  • Worked with Data-Warehouse Architecture and Designing Star Schema, Snowflake Schema, Fact and Dimensional Tables, Physical and Logical Data Modeling. Designed Mapping documents for Big Data Application.
  • Expertise in successful implementation of projects by following Software Development Life Cycle, including Documentation, Implementation, Unit testing, System testing, build and release.
  • Experience with Oracle 9i/10g, MySQL and SQL Server databases. Experience with Agile and Extreme Programming methodologies.
  • Experience in extracting data from a data warehouse (Teradata) onto Spark RDDs.
  • Experience in analyzing data using HiveQL, Pig and Map Reduce. Experience in job work-flow scheduling and monitoring tools like Oozie.
  • Developed a data lake serving as a base layer for developers to store data and run analytics; provided services for the developers, installed their custom software, upgraded Hadoop components, resolved their issues and helped them troubleshoot long-running jobs. Worked on Spark 1.5.2, including Spark Streaming, Spark MLlib, Spark GraphX, Spark SQL and Spark DataFrames.
  • Experience in developing, designing and maintaining web applications using Core Java, Web Services (REST/SOAP), JSP/Servlets, Spring MVC, Spring Batch.
  • Experienced in implementing multithreaded, server-side applications using Java and Java EE: Spring Framework (Core, MVC), Servlets, JSP, JDBC and web services (SOAP/REST).
  • Experienced in interacting with business teams to understand their information needs, ensuring that BI solutions are an excellent fit to their evolving requirements.
  • Extensive experience with design and development of J2EE based applications involving technologies such as Java Server Pages (JSP), Java Messaging Service (JMS), Java Data Base Connectivity (JDBC), Java Naming and Directory Interface (JNDI).
  • Configured compression codecs to compress data in the Hadoop cluster, and different queues on the cluster depending on the demand of the MapReduce jobs. Experience working with large-scale integration projects.
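
The MapReduce experience listed above (including jobs written in Python) can be illustrated with a minimal, framework-free sketch of the map and reduce phases of a word count; the function names here are illustrative, not from any specific project:

```python
from collections import defaultdict

def map_phase(lines):
    # Emit (word, 1) pairs, as a Hadoop Streaming mapper would to stdout.
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reduce_phase(pairs):
    # Sum counts per key; in Hadoop, the shuffle guarantees pairs arrive grouped by key.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

result = reduce_phase(map_phase(["big data big", "data pipelines"]))
# result == {"big": 2, "data": 2, "pipelines": 1}
```

In a real Hadoop Streaming job the two phases would be separate scripts reading stdin and writing tab-separated key/value pairs to stdout; the in-memory pipeline above only demonstrates the data flow.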

TECHNICAL SKILLS:

Hadoop/Big Data Technologies: HDFS, MapReduce, HBase, HCatalog, Pig, Hive, Sqoop, Spark, Impala, Cassandra, Oozie, YARN, Flume, Kafka; Hadoop distributions: Cloudera, Hortonworks

Programming / Scripting Languages: Java, Python, SQL, PL/SQL, Shell Scripting, Storm, Pig Latin, JSP & Servlets, JavaScript, XML, HTML

Frameworks: MVC, Spring, Struts, Hibernate, JSF, EJB, JMS

Web Technologies: HTML, XML, Ajax, SOAP, Java Script, CSS, JSP

Databases: SQL Server, MySQL, Oracle 8i/9i/11g

Database Tools: MS SQL Server, Oracle, MySQL, Splunk, Eclipse, Oracle SQL Developer

Other Concepts: OOPS, Data Structures, Algorithms, Software Engineering.

NoSQL Databases: HBase, Cassandra

Application Servers: Apache Tomcat, JBoss, WebLogic, WebSphere, Jetty

Methodologies: Scrum, Agile, Waterfall, SDLC

PROFESSIONAL EXPERIENCE:

Big Data Developer

Confidential, San Francisco, CA

Responsibilities:

  • Collaborate with data science teams in understanding data points, acquiring them and loading them into HDFS from disparate data sources.
  • Implemented ETL process to aggregate sales data from retail sites, social media, blogs and reports, to curate and validate it and build pipelines.
  • Crawled clusters, catalogued fields and inferred the meaning of each field, to later crowdsource and propagate via automation and provision to Hive tables.
  • Implemented Item categorization engine for cataloguing.
  • During the data curation process, used natural language processing tools and computer vision for processing SKU (Stock Keeping Unit) data.
  • Worked with Spark Streaming for anomaly detection, brand detection and dynamic taxonomy.
  • Implemented Kafka to de-couple data pipelines and help data flow in and out of Kafka through main producers, spark streaming engines on HDFS, NoSQL databases and analytic warehouse.
  • Developed multiple MapReduce programs in Java for data extraction, transformation and aggregation from multiple file formats, including XML, JSON and CSV.
  • Developed Scala scripts, DataFrames and RDDs in Spark for data aggregation and queries, and wrote data into the HBase database. Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Worked extensively with Spark core APIs, including Spark SQL, MLlib and GraphX.
  • Developed Apache Spark Scala APIs to make data easily accessible to the data science teams, using DataFrames and Datasets.
  • Developed and ran MapReduce jobs on YARN and Hadoop clusters to produce daily and monthly reports.
  • Debugged and troubleshot issues in Hive UDFs; scheduled and managed jobs on a Hadoop cluster using Oozie workflows.
  • Continuously tuned Hive queries for faster execution by employing partitioning and bucketing.
  • Worked with Hibernate to map Java objects to the relational database, and wrote SQL queries to fetch, insert and update data.
  • Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
  • Performed performance tuning of Spark jobs, using higher-level Spark core APIs for code optimization.
  • Wrote Naive Bayes and decision tree classifiers for customer data classification.
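
The streaming anomaly-detection work described above can be sketched, independently of Spark, as a rolling z-score check over a trailing window; the window values and threshold below are illustrative assumptions, not figures from the project:

```python
import statistics

def is_anomaly(window, value, threshold=3.0):
    # Flag a value whose z-score against the trailing window exceeds the threshold.
    mean = statistics.mean(window)
    stdev = statistics.pstdev(window)
    if stdev == 0:
        # A flat window gives no scale to judge deviation against.
        return False
    return abs(value - mean) / stdev > threshold

history = [10.0, 11.0, 9.0, 10.5, 10.0]
print(is_anomaly(history, 10.2))  # a reading near the window mean is not flagged
print(is_anomaly(history, 45.0))  # a large spike is flagged
```

In a Spark Streaming job the same check would run per key inside a windowed transformation; this sketch only captures the per-window logic.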

Environment: Hortonworks Hadoop distribution, HDFS, Spark core APIs (Spark SQL, MLlib, Streaming, GraphX), Pig, Hive, Oozie, Flume, Sqoop, Linux, Kafka

Hadoop Developer

Confidential, Cleveland, OH

Responsibilities:

  • Responsible for managing data coming from different sources, and for storage and processing in Hue covering all Hadoop ecosystem components.
  • Involved in creating Hive tables, and loading and analyzing data using Hive queries. Developed Simple to complex MapReduce Jobs using Hive and Pig.
  • Experience working with Red Gate SQL Response and SQL Data Compare for monitoring servers and SQL objects.
  • Experience using SnapManager on a project to back up and restore databases. Involved in system design and development in Core Java using Collections and Multithreading. Moved data between AWS services and on-premises data sources using AWS Data Pipeline.
  • Experience running MapReduce programs on the Amazon Elastic MapReduce framework, using Amazon S3 for input and output.
  • Experience with different Hadoop distributions, including Cloudera (CDH3 & CDH4), Hortonworks Data Platform (HDP) and Elastic MapReduce (EMR).
  • Implemented Spark using Scala and Spark SQL for faster testing and processing of data. Created EC2 virtual servers and RDS instances within a VPC. Used the Teradata FastLoad/MultiLoad utilities to load data into tables.
  • Automated builds upon check-in with Jenkins CI (Continuous Integration). Implemented UDFs in Java for Hive to process data that can't be handled using Hive built-in functions.
  • Developed simple to complex UNIX/Bash shell scripts as part of the framework development process. Wrote Flume and Hive scripts to extract, transform and load data into the database.
  • Used Oozie to orchestrate the MapReduce jobs, and worked with HCatalog to open up access to Hive's Metastore. Used RESTful web services for sending and receiving data between applications.
  • Developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive. Responsible for creating unit test cases and running them, reviewing others' test cases, testing other modules against those cases, and closing defects found in unit testing.
  • Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data. Used Mahout MapReduce to parallelize a single iteration. Responsible for implementing the application system with Core Java and the Spring Framework.
  • Set up Amazon EMR. Installed, managed and monitored the Hadoop cluster using Cloudera Manager.
  • Experience in developing Pig Latin and HiveQL scripts for data analysis. Experience with various performance optimizations, such as using the distributed cache for datasets, partitioning and query optimization in Hive.
  • Wrote automation scripts to monitor HDFS and HBase through cron jobs. Used AWS CloudFormation templates to deploy other AWS services. Secured the environment using AWS VPC.
  • Translated business processes into data mappings for building the data warehouse. Created Parquet Hive tables with complex data types corresponding to the Avro schema.
  • Processed data with Hive and Teradata, and developed web applications using Java and Oracle SQL. Developed Hive/Pig/Redshift scripts and hooked them into Oozie for automation.
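
One of the optimizations listed above is using the distributed cache for small datasets, i.e. a map-side join where each mapper holds the small table in memory so the large dataset never needs to be shuffled. A minimal sketch of that idea in plain Python, with hypothetical field names:

```python
def map_side_join(small_table, large_records):
    # Broadcast-style join: the small lookup table (as the distributed cache
    # would provide it) is held in memory, so each large record is enriched
    # locally without a reduce-side shuffle.
    lookup = {row["id"]: row["region"] for row in small_table}
    for rec in large_records:
        region = lookup.get(rec["store_id"])
        if region is not None:  # inner join: drop records with no match
            yield {**rec, "region": region}

stores = [{"id": 1, "region": "west"}, {"id": 2, "region": "east"}]
sales = [{"store_id": 1, "amount": 40}, {"store_id": 3, "amount": 7}]
joined = list(map_side_join(stores, sales))
# Only the sale matching a known store survives the inner join.
```

This pays off when one side of the join is small enough to fit in each task's memory; otherwise a reduce-side join is required.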

Environment: Cloudera's Hadoop Distribution 5.6, Core Java, Jenkins, Teradata, Apache Parquet, Git, UNIX/Linux shell scripting, HBase.

Java Developer with Hadoop

Confidential

Responsibilities:

  • Developed solutions for requirements, enhancements and defects. Involved in requirements design, development and system testing. Developed UI screens using JSP and Servlets.
  • Developed J2EE backing beans, action classes, action mappings, application facades and Hibernate classes to retrieve and submit data using the JSF framework.
  • Implemented the JSF package with the MVC framework. Created multiple view access levels for access control between administrators and adjusters.
  • Developed and utilized J2EE services, Servlet and JSP components. Implemented action classes to encapsulate business logic. Used the Struts framework for developing applications.
  • Gathered business requirements and wrote functional specifications and detailed design documents.
  • Wrote MapReduce jobs for aggregation, joins and analytics.
  • Built an analytics back end for US stocks to generate daily dashboards for corporate events, stocks on the move, price alerts based on predetermined cutoffs, etc. Market data is sourced from Google Finance using web APIs.
  • Mentored a group of application developers, assigned responsibilities, elaborated use cases, managed project schedules, and module targets.
  • Created the UI using JSP, JSF, JavaScript and jQuery. Worked on JavaScript for dynamic page content; utilized CSS for the front end.
  • Worked on the modernization of a legacy, outsourced UI. Technologies used were Backbone.js and Node.js.
  • Built and deployed Java applications with an MVC Model 2 architecture using Struts 2; designed and developed Servlets and JSPs for the controller and view layers respectively, where Servlets processed requests and transferred control to the appropriate JSP.
  • Worked on the development of the controller layer using the MVC Model 2 framework. Enhanced application performance by introducing multithreading, using the thread-state model and priority-based thread scheduling in Java.
  • Stored Procedures, database triggers were used at all levels. Communicating across the team about the processes, goals, guidelines and delivery of items.
  • Used iBATIS to populate data from the database. Used various design patterns such as Singleton, Facade, Command, Factory and DAO.
  • Used object-oriented analysis and design (OOA/D) for deriving objects and classes. Retrieved data from the back-end database using a DataSource with JDBC drivers.
  • Designed and developed JSP pages using the Struts framework and tag libraries. Involved in implementing the Spring MVC framework and developed DAO and service layers.

Environment: J2EE (Java 1.4, JSP, Servlets), Eclipse, MS SQL Server, T-SQL, Struts framework, WebLogic, Tomcat web server, XML, JDBC, JNDI, ANT, Windows XP, JavaScript, UML, Hortonworks Hadoop distribution.

Java Developer

Confidential

Responsibilities:

  • Involved in the design and implementation of the architecture for the project using OOAD, UML design patterns.
  • Developed Action classes and Action Forms for business logic, supported by the Spring Framework and the presentation tier.
  • Involved in design and development of server side layer using XML, JSP, JDBC, JNDI, EJB and DAO patterns using Eclipse IDE.
  • Involved in multi-tiered J2EE design utilizing Spring Inversion of Control (IOC) architecture and Hibernate.
  • Designed the logical and physical data model, generated DDL scripts and wrote DML scripts for the Oracle database. The creation of Java objects was automated using iBATIS.
  • Configured the controllers and different beans such as HandlerMapping and ViewResolver. Involved in developing Action Servlet classes and Action classes. Created Hibernate configuration files and the Struts application context file.
  • Designed and developed various modules of the application with frameworks like Spring MVC and Spring Web Flow, and the Spring BeanFactory using IoC and AOP concepts.
  • Used AngularJS for developing single-page applications and Bootstrap for responsive web design. Used Log4j for logging and debugging.
  • Involved in Documentation and Use case design by using UML modeling includes development of Class diagrams, Sequence diagrams and Use Case diagrams.
  • Worked with Unified Modeling Tools (UML) in designing Use Cases, Activity flow diagram, Class diagrams, Sequence and Object Diagrams.
  • Implemented features like logging and user session validation using the Spring AOP module. Used Hibernate 3 with annotations to handle all database operations.
  • Worked on generating web services classes using service-oriented architecture (SOA). Used JSP and Servlets for server-side transactions. Worked in a deadline-driven environment with immediate feature release cycles.
  • Developed Java code using Eclipse as the IDE. Utilized the Struts (MVC) framework and developed JSP pages, Action Servlets and XML-based action-mapping files for the web tier.
  • Configured Tomcat 4.1 for the application on a Windows NT server. Used JavaScript for validation of page data in the JSP pages. Responsible for code version management and unit test plans.

Environment: JavaScript, UML, application servers (Apache Tomcat 5.x/6.0, JBoss 4.0); Methodologies: Agile, UML, Design Patterns; Oracle 8i/9i.
