Sr. Big Data Architect Resume
Little Rock, AR
SUMMARY:
- Over 10 years of experience as a Big Data Engineer and Hadoop/Java Developer in the analysis, design, development, deployment and maintenance of Java/J2EE and Big Data applications.
- Expertise in data development on the Hortonworks HDP platform and Hadoop ecosystem tools such as HDFS, Spark, Zeppelin, Hive, HBase, Sqoop, Flume, Atlas, Solr, Pig, Falcon, Oozie, Hue, Tez, Apache NiFi and Kafka.
- Built streaming applications using Spark Streaming.
- Knowledge of the big data store HBase and the NoSQL databases MongoDB and Cassandra.
- Expertise in JavaScript, JavaScript MVC patterns, object-oriented JavaScript design patterns and AJAX calls.
- Experience includes Requirements Gathering, Design, Development, Integration, Documentation, Testing and Build.
- Experience working with MapReduce programs, Pig scripts and Hive queries to deliver optimal results.
- Good knowledge of Amazon Web Services (AWS) concepts such as EMR and EC2, which provide fast and efficient processing for Teradata big data analytics workloads.
- Experienced in collecting log and JSON data into HDFS using Flume and processing it with Hive/Pig.
- Strong knowledge and experience in Object Oriented Programming using Java.
- Extensively worked on developing and optimizing MapReduce programs, Pig scripts and Hive queries to create structured data for data mining.
- Expertise in developing presentation-layer components with HTML, CSS, JavaScript, jQuery, XML, JSON, AJAX and D3.
- Good knowledge of coding using SQL, SQL*Plus, T-SQL, PL/SQL and stored procedures/functions.
- Worked with Bootstrap, AngularJS, Node.js, Knockout, Ember and the Java Persistence API (JPA).
- Hands-on experience with advanced big data technologies such as the Spark ecosystem (Spark SQL, MLlib, SparkR and Spark Streaming), Kafka and predictive analytics.
- Knowledge of the Software Development Life Cycle (SDLC) and Agile and Waterfall methodologies.
- Strong experience in developing enterprise and web applications on n-tier architectures using Java/J2EE technologies such as Servlets, JSP, Spring, Hibernate, Struts, EJBs, Web Services, XML, JPA, JMS, JNDI and JDBC.
- Developed applications based on the Model-View-Controller (MVC) pattern.
- Working knowledge of Oozie, a workflow scheduler used to manage jobs that run Pig, Hive and Sqoop.
- Expertise in developing test cases for Unit testing, Integration testing and System testing.
- Extensive development experience in IDEs such as Eclipse, NetBeans, IntelliJ and STS.
- Experienced with languages and technologies such as C, C++, XPath, Core Java and JavaScript.
- Good experience installing, upgrading and configuring Red Hat Linux using Kickstart servers and interactive installation.
- Good experience with Tableau for data visualization and analysis of large data sets, drawing various conclusions.
- Extensive experience in building and deploying applications on web/application servers such as WebLogic, WebSphere and Tomcat.
- Expertise in Core Java, J2EE, multithreading, JDBC, Hibernate, Spring and shell scripting, and proficiency with Java APIs for application development.
- Strong problem-solving skills for identifying areas of improvement and incorporating best practices to deliver quality work.
- Excellent communication and interpersonal skills that contribute to completing project deliverables on or ahead of schedule.
TECHNICAL SKILLS:
Hadoop Ecosystem: Hadoop 2.7/2.5, MapReduce, Sqoop, Hive, Oozie, Pig, HDFS 1.2.4, Zookeeper, Flume, HBase, Impala, Spark 2.0/2.0.2, Storm, Hadoop distributions (Cloudera, Hortonworks and Pivotal).
NoSQL Databases: HBase, Cassandra, MongoDB 3.2.
Web Technologies: HTML5/4, CSS3/2, JavaScript, jQuery, Bootstrap 3/3.5, XML, JSON, AJAX
Java & J2EE Technologies: Core Java, Servlets, JSP, JDBC
IDE and Tools: Eclipse 4.6, NetBeans 8.2, IntelliJ, Maven
Languages: Java, SAS, Scala (with Apache Spark), SQL, PL/SQL, Pig Latin, HiveQL, Unix shell scripting.
Databases: Oracle, MySQL, DB2, MS SQL Server 2008.
Application Servers: Apache Tomcat, JBoss, IBM WebSphere, WebLogic.
Web Services: WSDL, SOAP, REST
Methodologies: Agile, RAD, JAD, RUP, Waterfall & Scrum.
PROFESSIONAL EXPERIENCE:
Confidential, Little Rock, AR
Sr. Big Data Architect
Responsibilities:
- Working as a Big Data Architect providing solutions for big data problems.
- Building scalable distributed data solutions using Hive, Python, Spark, Informatica Big Data Edition and Hadoop.
- Design, architect and help maintain scalable solutions on the big data analytics platform for enterprise modules.
- Worked on NoSQL databases such as MongoDB, HBase and Cassandra to enhance scalability and performance.
- Integrated Hadoop frameworks/technologies such as Hive and HBase to support both operational and analytical use cases.
- Analyzed large data sets (structured and unstructured) using Hive queries, R programming and Pig scripts.
- Accomplished multiple Prototypes and POCs for the product & modules.
- Analyze multiple sources of structured and unstructured data to propose and design data architecture solutions for scalability, high availability, fault tolerance, and elasticity.
- Worked on querying Impala from Python.
- Worked on provisioning EC2 infrastructure in AWS and deploying applications behind Elastic Load Balancing.
- Designed the real-time analytics and ingestion platform using Storm and Kafka.
- Responsible for data movement from client libraries and relational databases to HDFS using Linux jobs and Sqoop.
- Performed incremental data movement using Sqoop and Oozie jobs.
- Implemented big data systems in a distributed cloud environment (AWS) using Amazon EMR.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
- Assisted in implementing fact and dimension tables in a star schema model based on requirements.
- Worked on installing the cluster, commissioning and decommissioning Data Nodes, Name Node recovery, capacity planning and slot configuration.
- Involved in customer portfolio management, using the right data to provide recommendations.
- Involved in migrating ETL processes from Oracle to Hive to enable easier data manipulation.
- Participated in Rapid Application Development and Agile processes to deliver new cloud platform services.
- Working on Spark to convert existing MapReduce (Avro) jobs to Spark Core with Scala (see the sketch following this section's Environment line).
- Responsible for designing and deploying EDW application solutions, optimizing processes, and defining and implementing best practices.
- Managed and led the development effort with the help of a diverse internal and overseas team.
Environment: Java, Scala, XML, Oracle BDA (Cloudera), Cloudera Manager, Hadoop MapReduce, YARN, Oozie, Flume, Kafka, Spark Core, SQL, HP Vertica, Teradata, Tableau, Hive
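The bullet on converting an Avro-based MapReduce job to Spark Core can be illustrated with a minimal Scala sketch. It assumes the spark-avro connector is on the classpath; the paths, column names and aggregation are hypothetical placeholders, not the actual production job.

```scala
import org.apache.spark.sql.SparkSession

object AvroJobOnSpark {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("avro-mapreduce-port") // replaces the legacy MapReduce driver
      .getOrCreate()

    // Read the same Avro input the MapReduce job consumed
    // (format name assumes the spark-avro connector is available).
    val events = spark.read.format("avro").load("/data/raw/events") // hypothetical path

    // The old mapper's filter + projection expressed as DataFrame operations.
    val cleaned = events
      .filter(events("event_type") === "purchase") // hypothetical column
      .select("customer_id", "amount", "event_ts")

    // The old reducer's aggregation: total amount per customer.
    val totals = cleaned.groupBy("customer_id").sum("amount")

    totals.write.mode("overwrite").parquet("/data/curated/customer_totals") // hypothetical path
    spark.stop()
  }
}
```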
Confidential, Seattle, WA
Sr. Big data/Hadoop Engineer
Responsibilities:
- Worked on importing data from various sources and performed transformations using MapReduce and Pig to load data into HDFS.
- Worked with cross-functional teams to design and develop a Big Data platform.
- Loaded data from different data sources (Teradata and DB2) into HDFS using Sqoop and loaded it into partitioned Hive tables.
- Developed Bash scripts to pull TLOG files from the FTP server and process them for loading into Hive tables.
- Overwrote the Hive data with HBase data daily to keep it fresh, and used Sqoop to load data from DB2 into the HBase environment.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala, with good experience using Spark Shell and Spark Streaming.
- Designed, developed and maintained Big Data streaming and batch applications using Storm.
- Configured Sqoop jobs to import data from RDBMS into HDFS using Oozie workflows.
- Created Hive, Phoenix, HBase tables and HBase integrated Hive tables as per the design using ORC file format and Snappy compression.
- Developed Oozie workflows for daily incremental loads that pull data from Teradata and import it into Hive tables.
- Involved in the SDLC (requirements gathering, analysis, design, development and testing) of an application developed using the Agile methodology.
- Developed multiple scripts for analyzing data using Hive and Pig and integrating with HBase.
- Involved in loading data from the UNIX file system to HDFS using Flume, Kettle and the HDFS API.
- Implemented a log producer in Scala that watches application logs, transforms incremental log entries and sends them to a Kafka and ZooKeeper based log collection platform.
- Worked on Sequence files, RC files, Map side joins, bucketing, partitioning for Hive performance enhancement and storage improvement.
- Developed batch and real-time processing jobs using Spark, including Spark Streaming applications for real-time processing (see the sketch following this section's Environment line).
- Created HBase tables to store variable data formats coming from different portfolios.
- Performed real time analytics on HBase using Spark API and Rest API.
- Analyzed customer behavior by performing clickstream analysis, using Flume to ingest the data.
- Worked with QA and DevOps teams to troubleshoot any issues that arose during production.
- Created Cassandra tables to load large sets of structured, semi-structured and unstructured data coming from Linux, NoSQL and a variety of portfolios.
- Developed Pig scripts to transform the data into a structured format, automated through Oozie coordinators.
- Used Splunk to capture, index and correlate real-time data in a searchable repository from which it can generate reports and alerts.
Environment: Hadoop, HDFS, Spark, Storm, Kafka, MapReduce, Hive, Pig, Sqoop, Oozie, DB2, Java, Python, Splunk, UNIX Shell Scripting.
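A minimal Scala sketch of the Spark Streaming work described above, reading records from Kafka and landing each micro-batch in HDFS for downstream Hive processing. The broker address, topic name, group id and output path are hypothetical placeholders; the Kafka 0.10 direct stream API is assumed.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

object TlogStreamToHdfs {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("tlog-stream")
    val ssc  = new StreamingContext(conf, Seconds(30)) // 30-second micro-batches

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "kafka01:9092",          // hypothetical broker
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "tlog-consumers",        // hypothetical consumer group
      "auto.offset.reset"  -> "latest"
    )

    // Subscribe to the hypothetical "tlog" topic via the Kafka 0.10 direct stream API.
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("tlog"), kafkaParams)
    )

    // Persist each non-empty micro-batch of raw records to HDFS.
    stream.map(_.value).foreachRDD { (rdd, time) =>
      if (!rdd.isEmpty())
        rdd.saveAsTextFile(s"/data/landing/tlog/batch_${time.milliseconds}") // hypothetical path
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```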
Confidential, Novi, MI
Sr. Hadoop/Java Developer
Responsibilities:
- Developed Big Data Solutions that enabled the business and technology teams to make data-driven decisions on the best ways to acquire customers and provide them business solutions.
- Involved in installing, configuring and managing Hadoop Ecosystem components like Hive, Pig, Sqoop, Kafka and Flume.
- Integrated Apache Storm with Kafka to perform web analytics. Uploaded click stream data from Kafka to HDFS, HBase and Hive by integrating with Storm.
- Developed MapReduce programs to cleanse and parse data in HDFS obtained from various data sources and to perform joins on the Map side using distributed cache.
- Implemented daily workflow for extraction, processing and analysis of data with Oozie.
- Used the RegEx, JSON and Avro SerDes packaged with Hive for serialization and deserialization to parse the contents of streamed log data.
- Used Sqoop extensively to import data from RDBMS sources into HDFS; performed transformations, cleaning and filtering on the imported data using Hive and MapReduce, and loaded the final data into HDFS.
- Provisioned Cloudera Director AWS instances and added the Cloudera Manager repository to scale up the Hadoop cluster in AWS.
- Involved in loading data from the UNIX file system to HDFS using Flume, Kettle and the HDFS API.
- Involved in managing and reviewing Hadoop log files.
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with reference tables and historical metrics.
- Involved in running Hadoop streaming jobs to process terabytes of text data.
- Used Spark machine learning techniques implemented in Scala.
- Developed the technical strategy for integrating Spark for both pure streaming and more general data-computation needs.
- Configured Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS.
- Worked on implementing a log producer in Scala that watches application logs, transforms incremental log entries and sends them to a Kafka and ZooKeeper based log collection platform (see the sketch following this section's Environment line).
- Designed and Developed ETL jobs using Talend Big Data ETL.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among MapReduce jobs.
- Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
Environment: Hadoop (Cloudera), HDFS, MapReduce, Kafka, Hive, Scala, Pig, Sqoop, Oozie, AWS, Solaris, DB2, Spark SQL, Spark Streaming, Spark, UNIX Shell Scripting.
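A minimal Scala sketch of the log-producer idea mentioned above: tail an application log file and forward each newly appended line to a Kafka topic. The broker address, topic name and log path are hypothetical placeholders, and the file-tailing loop is deliberately simplified.

```scala
import java.io.RandomAccessFile
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object LogProducer {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "kafka01:9092") // hypothetical broker
    props.put("key.serializer",   "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    val producer = new KafkaProducer[String, String](props)

    val log = new RandomAccessFile("/var/log/app/application.log", "r") // hypothetical path
    var offset = log.length() // start from the current end of the file

    while (true) {
      if (log.length() > offset) { // new data appended since the last poll
        log.seek(offset)
        var line = log.readLine()
        while (line != null) {
          // One Kafka record per incremental log line.
          producer.send(new ProducerRecord[String, String]("app-logs", line))
          line = log.readLine()
        }
        offset = log.getFilePointer
      }
      Thread.sleep(1000) // poll for new log lines every second
    }
  }
}
```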
Confidential, Atlanta, GA
Sr. Java/J2EE Developer
Responsibilities:
- Used Core Java concepts such as multithreading, collections and garbage collection, along with other JEE technologies, during the development phase, and applied various design patterns.
- Developed the UI using Java, J2EE, HTML and CSS for interactive cross-browser functionality and complex user interfaces.
- Used JavaScript to simplify the code. Implemented database handling, multithreading, synchronization and communication.
- Developed integration services using SOA, Mule ESB, Web Services, SOAP, and WSDL.
- Used Eclipse IDE and deployed the application on Apache Tomcat server.
- Integrated Spring & Hibernate frameworks to develop end to end application.
- Used the Spring framework's JMS support to write to JMS queues and HibernateDaoSupport to interface with the database, and integrated Spring with JSF.
- Developed & consumed the web services using Apache CXF, JAX-WS, AXIS, WSDL, and SOAP.
- Developed Java and J2EE applications using Rapid Application Development (RAD), Eclipse.
- Designed the user interface and validations using JavaScript. Involved in the JMS connection pool and the implementation of publish/subscribe using Spring JMS.
- Designed, developed and maintained the data layer using the Hibernate ORM framework.
- Developed the J2EE application based on the Service Oriented Architecture.
- Created and injected Spring services, Spring controllers and DAOs to achieve dependency injection and to wire objects of business classes.
- Used Spring bean inheritance to derive beans from already defined parent beans.
- Used SOAP Lite module to communicate with different web-services based on given WSDL.
- Involved in designing, coding, debugging, documenting and maintaining a number of applications.
- Assisted in writing the SQL scripts to create and maintain the database, roles, users, tables in SQL Server 2008.
- Used Hibernate Transaction Management, Hibernate Batch Transactions, and cache concepts.
- Created complex SQL Queries, PL/SQL Stored procedures, Functions for back end.
- Developed various generic JavaScript functions used for validations.
- Developed screens using HTML, CSS, jQuery, JSP, JavaScript, AJAX and ExtJS.
- Created user-friendly GUI interfaces and web pages using HTML, AngularJS, jQuery and JavaScript.
- Used Log4j utility to generate run-time logs.
- Involved in full life cycle object-oriented application development: object modeling, database mapping and GUI design.
Environment: J2EE, Spring 3.0, Spring MVC, ZeroMQ, Hibernate 3.0, jQuery, JSON, JSF, Servlets, JDBC, AJAX, Web Services, SOAP, XML, Java Beans, XStream, JavaScript, Oracle 10g, IBM RAD, WebSphere, SQL Server 2008, SQL scripts, HTML, Eclipse.
Confidential, Montgomery, AL
Java/J2EE Developer
Responsibilities:
- Designed and developed entire application implementing MVC Architecture.
- Developed the application front end using Bootstrap, JavaScript and the AngularJS (Model-View-Controller) framework.
- Worked on technologies such as HTML, CSS, JavaScript, Core Java, JDBC and JSP.
- Worked in Eclipse with Apache Tomcat for development.
- Designed various user stories using UML and class diagrams based on OOP concepts.
- Participated in designing the web service (REST) framework in support of the product.
- Prioritized and scheduled daily activities for the department and assigned specific duties to each team member.
- Resolved production errors and deployed applications for end users.
- Assisted in responding to error reports and compiling a summary each month for management.
- Worked on High level and Low-level design documents.
- Used Log4J utility to log error, info and debug messages.
- Expertise in working with Log4j for logging and JUnit for unit and integration testing.
- Used Spring framework for implementing IOC/JDBC/ORM, AOP and Spring Security.
- Worked with Java, J2EE, Spring, RESTful web services and WebSphere 5.0/6.0 in a fast-paced development environment.
- Extensively worked with collection classes such as ArrayList, HashMap and Iterator.
- Used Spring IoC concepts to integrate Hibernate DAO classes with Struts Action classes.
- Extensively developed stored procedures, triggers, functions and packages in Oracle SQL and PL/SQL.
- Wrote standalone JavaScript and CSS files and reused them across UI pages.
- Developed the persistence layer using Hibernate ORM to transparently store objects in the database.
- Developed clickable prototypes in HTML, DHTML, Photoshop, CSS and JavaScript.
- Used JUnit to write repeatable tests, mainly for unit testing.
- Participated in deploying applications on the WebLogic Application Server.
- Used SVN for version controlling.
- Analyzed and fine-tuned RDBMS/SQL queries to improve application performance against the database.
- Created XML-based configuration and property files for the application and developed parsers using JAXP, SAX and DOM technologies.
- Proficient in developing applications with exposure to Java, JSP, UML, Servlets, Struts, Swing, DB2, Oracle (SQL, PL/SQL), HTML, JUnit, JSF, JavaScript and CSS.
Environment: Java, JavaScript, jQuery, JSP, Servlets, HTML, Oracle 9i, JUnit, MySQL, UML, Hibernate, Spring, Struts, WebSphere 5.0/6.0, JSF, CSS, Web Services (SOAP, RESTful)