Sr. Big Data/Hadoop Architect Resume
Pineville, NC
SUMMARY:
- Well-rounded professional with 10+ years of experience in the design, development, maintenance, and support of Java/J2EE and Big Data/Hadoop applications.
- Experience with major Hadoop ecosystem projects such as Pig, Hive, and HBase, and with monitoring them through Cloudera Manager, AWS, and Hortonworks.
- Hands-on experience working with NoSQL databases, including HBase and its integration with the Hadoop cluster.
- Knowledge of implementing Big Data workloads on Amazon Elastic MapReduce (Amazon EMR), processing and managing the Hadoop framework on dynamically scalable Amazon EC2 instances.
- Knowledge of machine learning algorithms such as Regression, K-Means, and SVM, and of cyber security concepts such as cryptography, access control, and data security in Linux and Unix.
- Experience working in environments using Agile (Scrum), RUP, and Test-Driven Development methodologies.
- In-depth understanding/knowledge of Hadoop architecture and its various components, such as HDFS, Job Tracker, Task Tracker, NameNode, and DataNode.
- Experience with design patterns such as Singleton, Data Access Object, and MVC, as well as the Agile/Scrum methodology.
- Involved in creating custom UDFs for Pig and Hive to incorporate Python/Java methods and functionality into Pig Latin and HQL (HiveQL).
- Good knowledge of coding using SQL, SQL Plus, T-SQL, PL/SQL, Stored Procedures/Functions.
- Worked on Bootstrap, AngularJS, Node.js, Knockout, Ember, and the Java Persistence API (JPA).
- Expertise in using Java performance tuning tools such as JMeter and JProfiler, and Log4j for logging.
- Extensive experience in using the MVC (Model View Controller) architecture for developing applications with JSP, JavaBeans, and Servlets.
- Hands-on experience in installing, configuring, and using ecosystem components and languages such as Hadoop MapReduce, HDFS, Pig, Hive, Sqoop, Python, Scala, and Spark.
- Extensive experience in multiple Java and J2EE technologies such as Servlets, JSP, JSTL, Spring, Struts, SiteMesh, iBATIS, Hibernate and JPA, XML, XSD, HTML, JavaScript, jQuery, AJAX, JUnit, WSDL, SOAP, RESTful web services (Restlet), and ActionScript 3.0.
- Expertise with the Spring Framework using components such as MVC, Transactions, ORM, and JDBC; also used the Hibernate ORM, JSF, and Struts MVC frameworks.
- Extensive work experience in designing and developing UIs using JSP, JSTL, HTML, CSS, jQuery, AJAX, JavaScript, and GWT.
- Expertise in developing XML documents with XSD validation and in using SAX, DOM, and JAXP parsers to parse data held in XML documents.
- Proficient in writing Ant scripts for development and deployment purposes.
- Experience with GUI/IDE tools including Eclipse, JBuilder, and NetBeans.
- Experience with application and web servers such as WebLogic 12c, Tomcat and Jetty.
- Proficiency in developing complex SQL queries, Stored Procedures on various databases like Oracle, PostgreSQL and SQL Server.
- Experience and strong knowledge of implementing Spark Core, Spark Streaming, Spark SQL, and MLlib; a brief illustrative sketch follows this list.
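The following is a minimal, hedged sketch of the kind of Spark SQL batch aggregation referred to above, written with the Java API; the application name, HDFS paths, and column names are hypothetical and included only to illustrate the pattern.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class DailyTotalsJob {
    public static void main(String[] args) {
        // Hypothetical app name; paths and columns below are placeholders, not project specifics.
        SparkSession spark = SparkSession.builder()
                .appName("spark-sql-daily-totals")
                .getOrCreate();

        // Read curated data from HDFS and expose it to SQL as a temporary view.
        Dataset<Row> orders = spark.read().parquet("hdfs:///data/curated/orders");
        orders.createOrReplaceTempView("orders");

        // Aggregate with plain SQL and write the result back to HDFS for downstream reporting.
        Dataset<Row> dailyTotals = spark.sql(
                "SELECT order_date, SUM(amount) AS total_amount FROM orders GROUP BY order_date");
        dailyTotals.write().mode("overwrite").parquet("hdfs:///data/marts/daily_totals");

        spark.stop();
    }
}
```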
TECHNICAL SKILLS:
Big Data Ecosystem: HDFS, MapReduce, YARN, Hive, HBase, Impala, Sqoop, Oozie, Apache Cassandra, Flume, Spark, Splunk, Kafka, Avro.
Hadoop Clusters: Cloudera CDH 4/5, Hortonworks HDP 2.3/2.4, Amazon Web Services (AWS)
Java & J2EE Technologies: JDBC, Spring, Servlets, Struts, JSP, Java, Web Services using SOAP, REST, WSDL, HTML, JavaScript, jQuery, XML, XSL, XSD, JSON, CSS, Hibernate.
Languages: XML, XSL, UML, HTML, DHTML, WML, SQL, PL/SQL, jQuery.
Databases: MySQL, DB2, Oracle 12c/11g, MS SQL Server 2014/2016, MongoDB, Cassandra, HBase, MS Access.
IDE: Eclipse, NetBeans, IntelliJ.
Development Tools: Maven, TOAD, SQL Workbench, Ant
Web/Application Servers: WebLogic, IBM WebSphere Application Server, Tomcat, JBoss and Apache Web Server.
Methodologies: Agile, RAD, JAD, RUP, Waterfall & Scrum.
PROFESSIONAL EXPERIENCE:
Confidential, Pineville, NC
Sr. Big Data/Hadoop Architect
Responsibilities:
- Working as a Big Data Architect, providing solutions for big data problems.
- Worked in an Agile development environment with two-week sprint cycles, dividing and organizing tasks; participated in daily scrums and other design-related meetings.
- Design, architect, and help maintain scalable solutions on the big data analytics platform for the enterprise module.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig Scripts.
- Created real-time ingestion of structured and unstructured data into Hadoop and MemSQL using Kafka and Spark Streaming; populated dimension and fact tables and was actively involved in creating Talend mappings (see the sketch after this list).
- Started using Apache NiFi to copy data from the local file system to HDP.
- Imported data from AWS S3 into Spark RDDs and performed transformations and actions on them.
- Migrated the physical data center environment to AWS; also designed, built, and deployed a multitude of applications utilizing much of the AWS stack (EC2, S3, RDS, etc.).
- Implemented solutions for ingesting data from various sources and processing it with Big Data technologies such as Hive, Spark, Pig, Sqoop, HBase, MapReduce, etc.
- Used input and output data as delimited files in HDFS via Talend Big Data Studio with different Hadoop components such as Hive, Pig, and Spark.
- Developed Scala scripts and UDFs, using both DataFrames/SQL and RDDs in Spark, for data aggregation and queries, and wrote data back into the OLTP system through Sqoop.
- Involved in loading and transforming large sets of structured, semi-structured, and unstructured data from relational databases into HDFS using Sqoop imports.
- Installed, configured, and operated ZooKeeper, Pig, Falcon, Sqoop, Hive, HBase, Kafka, and Spark for business needs.
- Worked with the MapR Converged Data Platform, which was built with real-time data movement in mind.
- Created tables inside the RDBMS, inserted data, and then loaded the same tables into HDFS and Hive using Sqoop.
- Worked with business stakeholders to translate business objectives and requirements into technical requirements and design.
- Defined the application architecture and design for the Big Data Hadoop initiative to maintain structured and unstructured data; created a reference architecture for the enterprise.
- Identified data sources, created source-to-target mappings, estimated storage, and provided support for Hadoop cluster setup and data partitioning.
- Developed scripts for data ingestion using Sqoop and Flume, wrote Spark SQL and Hive queries for analyzing the data, and worked on performance optimization.
- Responsible for developing a data pipeline on AWS to extract data from weblogs and store it in Amazon EMR.
- Wrote DDL and DML files to create and manipulate tables in the database
- Developed the UNIX shell/Python scripts for creating the reports from Hive data.
- Optimized Map/Reduce Jobs to use HDFS efficiently by using various compression mechanisms
- Responsible for writing Hive Queries for analyzing data in Hive warehouse using Hive Query Language (HQL).
- Analyzed data using the Hadoop components Hive and Pig and created tables in Hive for the end users.
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
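The Kafka-to-HDFS leg of the real-time ingestion described above can be sketched roughly as below with Spark Structured Streaming's Java API (the MemSQL and Talend pieces are omitted); the broker address, topic name, and HDFS paths are hypothetical, and the job assumes the spark-sql-kafka connector is on the classpath.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;

public class KafkaToHdfsIngest {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("kafka-to-hdfs-ingest") // hypothetical application name
                .getOrCreate();

        // Subscribe to a placeholder Kafka topic and expose key/value as strings.
        Dataset<Row> events = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "broker1:9092")
                .option("subscribe", "clickstream")
                .load()
                .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)", "timestamp");

        // Land the stream on HDFS as Parquet; the checkpoint provides fault-tolerant file output.
        StreamingQuery query = events.writeStream()
                .format("parquet")
                .option("path", "hdfs:///data/raw/clickstream")
                .option("checkpointLocation", "hdfs:///checkpoints/clickstream")
                .start();

        query.awaitTermination();
    }
}
```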
Environment: Hadoop, MLlib, MapReduce, MySQL, MongoDB, HDFS, YARN, Hive, Pig, Sqoop 1.6, Flume, Amazon Web Services EC2, Spark, Scala, jQuery, Spring, JUnit, XML, Python.
Confidential, Centennial, CO
Sr. Big Data/Hadoop Architect
Responsibilities:
- Involved in various stages of Software Development Life Cycle (SDLC) deliverables for the project using the Agile software development methodology.
- Responsibilities included resource management, client meetings, implementation and design, coordinating offshore teams, budgetary analysis, and risk management.
- Followed the Agile/Scrum project development methodology for implementing projects; took part in the daily scrum meetings and sprint meetings.
- Managed Hadoop clusters in the cloud on AWS instances.
- Worked on customizing MapReduce code in Amazon EMR using the Hive, Pig, and Impala frameworks.
- Analyzed requirements and designed the data model for Cassandra and Hive based on the current relational databases in Oracle and Teradata.
- Loaded customer profile, customer spending, and credit data from legacy warehouses onto HDFS using Sqoop.
- Supported the data analytics team by providing data from various sources in Hive using Spark SQL.
- Set up the architecture for big data capture, representation, information extraction, and fusion.
- Created a Hive aggregator to update the Hive table after running the data profiling job. Analyzed large data sets by running Hive queries.
- Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
- Extracted data from Teradata to HDFS using Sqoop.
- Analyzed the data by performing Hive queries.
- Implemented Partitioning, Dynamic Partitioning and Bucketing in Hive.
- Developed Hive queries to process the data and generate data cubes for visualization.
- Built reusable Hive UDF libraries for business requirements, enabling users to apply these UDFs in Hive queries (a minimal sketch follows this list).
- Provide mentorship and guidance to other architects to help them become independent
- Provide review and feedback for existing physical architecture, data architecture and individual code
- Loaded data from the local file system (Linux) to HDFS.
- Debugged and solved Hadoop issues as the on-the-ground subject matter expert, covering everything from patching components to post-mortem analysis of errors.
- Worked on Informatica PowerCenter and Informatica PowerExchange for metadata analysis.
- Wrote Hive UDFs to sort struct fields and return complex data types.
- Modeled Hive partitions extensively for data separation and faster data processing and followed Pig and Hive best practices for tuning.
- Exported the patterns analyzed back to Teradata using Sqoop.
- Implemented a script to transmit sys print information from Oracle to HBase using Sqoop.
- Loaded JSON-style documents into a NoSQL database (MongoDB) and deployed the data to the cloud service Amazon Redshift.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports.
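A minimal skeleton of the kind of reusable Hive UDF mentioned above is sketched below; it is not the struct-sorting UDF itself, just the simplest form of the pattern, and the class and function names are hypothetical. Such a class would typically be packaged in a JAR and registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/**
 * Illustrative Hive UDF (hypothetical name) that trims and upper-cases a string column.
 * Registered in Hive with, for example:
 *   ADD JAR hive-udfs.jar;
 *   CREATE TEMPORARY FUNCTION normalize_text AS 'NormalizeTextUDF';
 */
public final class NormalizeTextUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null; // preserve NULL semantics expected by Hive
        }
        return new Text(input.toString().trim().toUpperCase());
    }
}
```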
Environment: Hortonworks, Hadoop, Hive, Sqoop, HBase, MapReduce, HDFS, Pig, Cassandra, Java, Oracle 11g/10g, FileZilla, Unix Shell Scripting, Toad, SQL, MySQL Workbench, XML, NoSQL, Tableau, Flume
Confidential, Minnetonka, MN
Sr. Java/Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop
- Used Sqoop to load data from Oracle Database into Hive
- Designing and implementing Java/MapReduce utilities for analyzing aggregated data.
- Extended Hive and Pig core functionality by writing custom UDFs in Java.
- Developed MapReduce programs to cleanse the data in HDFS obtained from multiple data sources (see the sketch after this list).
- Implemented various Pig UDFs for converting unstructured data into structured data.
- Involved in creating Hive tables as per requirement defined with appropriate static and dynamic partitions
- Used Hive to analyze the data in HDFS to identify issues and behavioral patterns
- Used the Spring MVC architecture and Hibernate ORM to map Java classes to the Oracle database.
- Created alter, insert, and delete queries involving lists, sets, and maps in DataStax Cassandra.
- Involved in scheduling Oozie workflow engine to run multiple Hive and Pig jobs
- Developed Use Case diagrams, business flow diagrams, Activity/State diagrams.
- Wrote custom Java web services for various functionalities of the application; adopted J2EE design patterns such as MVC and Business Facade.
- Worked on JAX-RS to develop Restful web service and enterprise integration patterns using Apache CXF.
- Created complex SQL queries and stored procedures.
- Developed the XML schema and Web services for the data support and structures.
- Implemented the Web service client for login verification, credit reports and applicant information using Apache Axis web service.
- Responsible for designing and managing the Sqoop jobs that import data from the data warehouse platform to HDFS.
- Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios
- Developed custom tool to convert XML messages to objects with JAXB and JPA capabilities to create mapping between the XML and IBM DB2 objects.
- Implemented various J2EE design patterns such as MVC, Data Access Objects (DAO), and Singleton in the development of multi-tier distributed applications.
- Developed WebSphere Message Broker flows for processing messages from various sources such as Java Message Service (JMS) providers, Hypertext Transfer Protocol (HTTP) calls, or data read from files.
- Involved in production Hadoop cluster set up, administration, maintenance, monitoring and support
- Used Flume to collect web logs from the online ad servers and push them into HDFS.
- Implemented and executed MapReduce jobs to process the log data from the ad servers.
- Deployed the applications on JBoss Application Server.
- Used Oracle database for tables creation and involved in writing SQL queries using Joins and Stored Procedures.
- Developed web services for sending and getting data from different applications using RESTful web services with JAX-RS (Jersey).
- Implemented Log4j for logging; used Maven to compile and package the application and to automate build and deployment scripts.
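A rough sketch of the kind of data-cleansing MapReduce step mentioned above is shown below; the delimiter, expected field count, and class name are hypothetical, and the real jobs would have matched the actual source schemas.

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

/** Map-only cleansing step that keeps well-formed, pipe-delimited records and counts the rest. */
public class CleanseMapper extends Mapper<LongWritable, Text, NullWritable, Text> {

    private static final int EXPECTED_FIELDS = 12; // hypothetical schema width

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\\|", -1);
        if (fields.length == EXPECTED_FIELDS && !fields[0].trim().isEmpty()) {
            // Record looks valid: pass it through unchanged.
            context.write(NullWritable.get(), value);
        } else {
            // Track malformed records via a counter instead of failing the job.
            context.getCounter("cleanse", "malformed_records").increment(1);
        }
    }
}
```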
Environment: MyEclipse, Hadoop, Pig, Sqoop, Oozie, MapReduce, HDFS, Hive, Java, HBase, Flume, Oracle 10g, UNIX Shell Scripting, Spring MVC, Maven, Hibernate 4.0, JSP, UML Designing, Core Java, CSS, HTML4
Confidential, Allentown, PA
Sr. Java/J2EE Developer
Responsibilities:
- Involved in analysis, design and development phases of the project. Adopted agile methodology throughout all the phases of the application.
- Worked on developing the application involving Spring MVC implementations and RESTful web services (a minimal controller sketch follows this list).
- Responsible for designing Rich user Interface Applications using JavaScript, CSS, HTML, XHTML and AJAX.
- Developed Spring AOP code to configure logging for the application.
- Involved in the analysis, design, development, and testing phases of the Software Development Life Cycle (SDLC).
- Worked in an Agile team for rapid development and improved coding efficiency.
- Developed code using Core Java to implement technical enhancement following Java Standards.
- Worked with Swing and RCP using Oracle ADF to develop a search application as part of a migration project.
- Implemented Hibernate utility classes, session factory methods, and different annotations to work with back-end database tables.
- Implemented Ajax calls using JSF-Ajax integration and implemented cross-domain calls using jQuery Ajax methods.
- Implemented object-relational mapping in the persistence layer using the Hibernate framework in conjunction with Spring functionality.
- Used JPA (Java Persistence API) with Hibernate as Persistence provider for Object Relational mapping.
- Used JDBC and Hibernate for persisting data to different relational databases.
- Developed and implemented a Swing, Spring, and J2EE based MVC (Model-View-Controller) framework for the application.
- Implemented application level persistence using Hibernate and Spring.
- Integrated Data Warehouse (DW) data from different sources in different formats (PDF, TIFF, JPEG, web crawl, and RDBMS data from MySQL, Oracle, SQL Server, etc.).
- Used XML and JSON for transferring/retrieving data between different Applications.
- Wrote complex PL/SQL queries using joins, stored procedures, functions, triggers, cursors, and indexes in the data access layer.
- Implemented a RESTful web services architecture for client-server interaction and implemented the corresponding POJOs.
- Designed and developed SOAP web services using the CXF framework for communicating application services with different applications, and developed web service interceptors.
- Implemented the project using JAX-WS based Web Services using WSDL, UDDI, and SOAP to communicate with other systems.
- Involved in writing application level code to interact with APIs, Web Services using AJAX, JSON and XML.
- Wrote JUnit test cases for all the classes. Worked with Quality Assurance team in tracking and fixing bugs.
- Developed back-end interfaces using embedded SQL, PL/SQL packages, stored procedures, functions, exception handling in PL/SQL programs, and triggers.
- Used Log4j to capture logs, including runtime exceptions, and for logging information.
- Used Ant as the build tool and developed build files for compiling the code and creating WAR files.
- Used Tortoise SVN for Source Control and Version Management.
- Responsibilities included designing for future user requirements by interacting with users, as well as new development and maintenance of the existing source code.
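A minimal sketch of a Spring MVC REST endpoint of the kind described in this role follows; the URL, class name, and stubbed response are hypothetical, and it assumes Spring 3.1+ with a JSON message converter (for example, Jackson) on the classpath.

```java
import java.util.HashMap;
import java.util.Map;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.ResponseBody;

/** Hypothetical read-only REST endpoint; a real controller would delegate to a service/Hibernate layer. */
@Controller
@RequestMapping("/api/accounts")
public class AccountController {

    @RequestMapping(value = "/{id}", method = RequestMethod.GET, produces = "application/json")
    @ResponseBody
    public Map<String, Object> getAccount(@PathVariable("id") long id) {
        // Stubbed payload, serialized to JSON by the configured message converter.
        Map<String, Object> account = new HashMap<String, Object>();
        account.put("id", id);
        account.put("status", "ACTIVE");
        return account;
    }
}
```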
Environment: JAVA, J2EE, JDK 1.5, Servlets, JSP, XML, JSF, Web Services (JAX-WS: WSDL, SOAP), Spring MVC, JNDI, Hibernate 3.6, JDBC, SQL, PL/SQL, HTML, DHTML, JavaScript, Ajax, Oracle 10g, SOAP, SVN, SQL, Log4j, ANT.
Confidential
Java/J2EE Developer
Responsibilities:
- Analyzed use cases, created interfaces and designed the core functionality from presentation layer to business logic layer.
- Responsibilities include analysis of applications, designing of the enterprise applications, functional, technical and project management.
- Extensively developed stored procedures, triggers, functions, and packages in Oracle SQL and PL/SQL.
- Used Rational Rose for developing Use case diagrams, Activity flow diagrams, Class diagrams and Object diagrams in the design phase.
- Consumed web services and parsed WSDL files using a DOM parser.
- Developed a fully functional prototype application using JavaScript and Bootstrap, connecting to a Restful server on a different domain.
- Implemented GUI pages using JSP, HTML, CSS, JavaScript, jQuery, JSON, and AJAX.
- Implemented the online application using Spring MVC framework, Core Java, JSP, Servlet.
- Re-created and transferred the entire workspace to Rational Application Developer (RAD 7.0) in order to employ future benefits over using Eclipse IDE.
- Started and stopped WebLogic servers, created DB connection pools and queue connection factories, configured SSL, and installed certificates.
- Implemented web services using a bottom-up approach with WSDL.
- Developed web-based customer management software using Facelets, ICEfaces, and JSF.
- Implemented Ajax frameworks and jQuery tools such as autocompleter, tab module, calendar, and floating windows.
- Developed web services for sending and getting raw Extract data from different applications using SOAP messages.
- Created REST web services for the management of data using Apache CXF (JAX-RS); see the sketch after this list.
- Involved in JUnit Testing of various modules by generating the Test Cases.
- Managed connectivity using JDBC for querying/inserting & data management including triggers and stored procedures.
- Developed JavaScript based components using Ext JS framework like GRID, Tree Panel with client reports customized according to user requirements.
- Performed building and deployment of WAR, JAR files on test, stage, and production systems in JBoss application server.
- Used the MyEclipse IDE and deployed the application on the BEA WebLogic Application Server using Ant scripts.
- Helped trainees finish their assignments using several frameworks and technologies such as Java applets, Spring MVC, JDBC, and Struts.
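A minimal JAX-RS resource of the kind exposed through Apache CXF in this role might look like the sketch below; the path, class name, and stubbed payload are hypothetical, and in CXF such a resource is typically wired up through a jaxrs:server bean definition or the CXF servlet.

```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

/** Hypothetical JAX-RS resource; a real implementation would call a DAO instead of returning a stub. */
@Path("/customers")
public class CustomerResource {

    @GET
    @Path("/{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public Response getCustomer(@PathParam("id") String id) {
        // Stubbed JSON payload standing in for a real lookup.
        String payload = "{\"id\":\"" + id + "\",\"status\":\"ACTIVE\"}";
        return Response.ok(payload).build();
    }
}
```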
Environment: Core Java, Struts Framework, Spring Framework, Hibernate, JSON, JSP, Servlets, JavaScript, JQuery, Maven, JUnit, JIRA, Tomcat, XML, XSL, ANT, PL/SQL, Oracle, Eclipse IDE, HTML, CSS, UML, Unix.