Sr. Big Data Architect Resume
NYC, NY
SUMMARY:
- 9+ years of experience across the SDLC, with key emphasis on Big Data technologies and expertise in developing Hadoop/Big Data and web-based applications against a variety of back-end databases.
- Knowledge of configuring and managing Cloudera's Hadoop platform, including CDH3 and CDH4 clusters.
- Knowledge and experience of the architecture and functionality of NoSQL databases such as Cassandra and MongoDB.
- Experienced in developing web-based GUIs using JavaScript, JSP, HTML, jQuery, XML and CSS3.
- Experienced in collecting log data from various sources and integrating it into HDFS using Flume, and in developing custom UDFs for Hive (a short illustrative sketch follows this list).
- Experienced in validating data in HDFS and Hive for each data transaction.
- Experienced in importing and exporting data between HDFS and relational database systems using Sqoop.
- Excellent understanding of Hadoop architecture and underlying framework including storage management.
- Experienced in working with NoSQL databases (HBase, Cassandra and MongoDB), database performance tuning and data modeling.
- Experience in using PL/SQL to write Stored Procedures, Functions and Triggers.
- Excellent technical and analytical skills with clear understanding of design goals of ER modeling for OLTP and dimension modeling for OLAP.
- Experience in Cloudera, HortonWorks, MapR and Amazon Web Services distributions of Hadoop.
- Experience in writing custom UDFs in Java to extend Hive and Pig core functionality.
- Strong expertise in implementing end-to-end Big Data projects in the Hadoop ecosystem - Hortonworks Data Platform, Cloudera distribution, HDFS, MapReduce, Hive, Pig, Kafka, Spark, HBase, Oozie and EMR.
- Experience in using Hadoop ecosystem components such as MapReduce, HDFS, HBase, ZooKeeper, Hive, Sqoop, Pig, Flume, Spark and Cloudera.
- Expertise in various Java/J2EE technologies such as JSP, Servlets, Hibernate, Struts and Spring.
- Strong expertise in Amazon AWS EC2, S3, Kinesis and other services.
- Experience includes Requirements Gathering, Design, Development, Integration, Documentation, Testing and Build.
- Experience in working with Map Reduce programs, Pig scripts and Hive commands to deliver the best results.
- Diverse experience utilizing Java tools in business, web, and client-server environments, including Java Platform, Enterprise Edition (Java EE), Enterprise JavaBeans (EJB), JavaServer Pages (JSP), Java Servlets (including JNDI), Struts, and Java Database Connectivity (JDBC) technologies.
- Hands on experience in Core Java, Servlets, JSP, JDBC, C#, JavaScript.
- Knowledge of the Eclipse IDE for developing Java projects.
- Proficient in developing applications using Java/J2EE design patterns and industry's best design practices.
- Good middleware skills in J2EE and web services with application servers such as Apache Tomcat, BEA WebLogic, IBM WebSphere and JBoss, with experience on heterogeneous operating systems.
- Extensive experience with Log4j for creating logs of different categories.
- Good knowledge of web-based UI development using jQuery UI, jQuery, ExtJS, CSS3, HTML, HTML5, XHTML and JavaScript.
- Experience with unit, functional, system and integration testing of applications using JUnit, Mockito, Jasmine, Cucumber, PowerMock and EasyMock.
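The custom Hive UDF work noted above was done in Java; the snippet below is only an illustrative sketch of the same org.apache.hadoop.hive.ql.exec.UDF contract, written in Scala for consistency with the Spark examples later in this resume. The class name and masking logic are hypothetical, not taken from any project.

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Hypothetical masking UDF; Hive resolves evaluate() by reflection, so any JVM class works.
class MaskAccountId extends UDF {
  def evaluate(input: Text): Text = {
    if (input == null) null
    else {
      val s = input.toString
      // Keep only the last four characters visible.
      new Text(("*" * math.max(0, s.length - 4)) + s.takeRight(4))
    }
  }
}
```

After packaging the class into a JAR, it would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in queries.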
TECHNICAL SKILLS:
Databases: Microsoft Access, MS SQL, Oracle 12c/11g/10g/9i
NoSQL Databases: Cassandra, MongoDB
Web/Application Servers: Apache Tomcat 6.0/7.0/8.0, JBoss
Operating Systems: UNIX, Ubuntu Linux, Windows, CentOS, Sun Solaris
Network Protocols: LAN and WAN, TCP/IP fundamentals
Hadoop/Big Data: Oozie, Flume, Scala, Akka, Kafka, Storm, MapReduce, HDFS, Hive, Pig, HBase, ZooKeeper, Sqoop
Java/J2EE Technologies: JDBC, JavaScript, JSP, Servlets, jQuery
Web Technologies: HTML, DHTML, XML, XHTML, JavaScript, CSS, XSLT, EMR, AWS
Frameworks: Spring, Hibernate, MVC, Struts
Languages: Python, XPath, Spark, Java, J2EE, PL/SQL, Pig Latin, HQL
PROFESSIONAL EXPERIENCE:
Confidential, NYC, NY
Sr. Big Data Architect
Responsibilities:
- Implemented an enterprise-grade platform (MarkLogic) for ETL from mainframe to NoSQL (Cassandra).
- Experience in BI reporting with AtScale OLAP for Big Data.
- Responsible for importing log files from various sources into HDFS using Flume.
- Worked with tools such as Flume, Storm and Spark.
- Developed business analytics scripts using Hive SQL.
- Implemented continuous integration and deployment (CI/CD) through Jenkins for Hadoop jobs.
- Wrote Hadoop jobs to analyze data using Hive and Pig, accessing text, sequence and Parquet files.
- Experience with different Hadoop distributions: Cloudera (CDH3 and CDH4), Hortonworks Data Platform (HDP) and MapR.
- Experience integrating Oozie logs into a Kibana dashboard.
- Extracted data from MySQL and AWS Redshift into HDFS using Sqoop.
- Developed Spark code using Scala and Spark SQL for faster testing and data processing.
- Imported millions of structured records from relational databases using Sqoop, stored them in HDFS in CSV format and processed them with Spark.
- Developed a Spark Streaming application to pull data from the cloud into Hive tables.
- Used Spark SQL to process large volumes of structured data.
- Assigned names to columns using Scala case classes (see the sketch after this list).
- Implemented a Spark GraphX application to analyze guest behavior for data science segments.
- Enhanced the traditional star-schema data warehouse, updated data models, and performed data analytics and reporting using Tableau.
- Implemented the Big Data ecosystem (Hive, Impala, Sqoop, Flume, Spark, Lambda) within a cloud architecture.
- Implemented solutions for ingesting data from various sources and processing the data at rest using Big Data technologies such as Hadoop, the MapReduce framework, HBase and Hive.
- Loaded and transformed large sets of structured, semi-structured and unstructured data using Hadoop/Big Data concepts.
- Designed and developed a real-time stream processing application using Spark, Kafka, Scala and Hive to perform streaming ETL and apply machine learning.
- Identified query duplication, complexity and dependencies to minimize migration effort.
- Experience with AWS, implementing solutions using services such as EC2, S3, RDS, Redshift and VPC.
- Worked as a Hadoop consultant on MapReduce, Pig, Hive and Sqoop.
- Worked with Spark and Python.
- Worked with Apache Hadoop ecosystem components such as HDFS, Hive, Sqoop, Pig and MapReduce.
- Led the architecture and design of data processing, warehousing and analytics initiatives.
- Worked with AWS to implement client-side encryption, as DynamoDB did not support encryption at rest at the time.
- Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs and Spark on YARN.
- Used the DataFrame API in Scala to work with distributed collections of data organized into named columns.
- Performed data profiling and transformation on the raw data using Pig, Python, and Java.
- Experienced with batch processing of data sources using Apache Spark.
- Developed predictive analytics using the Apache Spark Scala APIs.
- Involved in big data analysis using Pig and user-defined functions (UDFs).
- Created Hive external tables, loaded data into them and queried the data using HQL.
- Used Sqoop to efficiently transfer data between databases and HDFS and used Flume to stream the log data from servers.
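As a brief illustration of the Sqoop-to-Spark flow described above (naming columns with a Scala case class and querying with Spark SQL), the sketch below uses a hypothetical transactions schema and HDFS path; the actual schemas, paths and queries were project-specific.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical column layout for the Sqoop-imported CSV files.
case class Transaction(txnId: String, accountId: String, amount: Double, txnDate: String)

object TransactionLoader {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("TransactionLoader").enableHiveSupport().getOrCreate()
    import spark.implicits._

    // Read the CSV files Sqoop landed in HDFS and name the columns via the case class.
    val txns = spark.read
      .csv("hdfs:///data/raw/transactions")              // illustrative path
      .toDF("txnId", "accountId", "amount", "txnDate")
      .withColumn("amount", $"amount".cast("double"))
      .as[Transaction]

    // Spark SQL over the named columns, e.g. a quick per-account count.
    txns.createOrReplaceTempView("transactions")
    spark.sql("SELECT accountId, COUNT(*) AS txn_count FROM transactions GROUP BY accountId").show(20)
  }
}
```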
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Hue, Ganglia, Nagios, Kafka, Elasticsearch, SQL, Scala, Oracle, Netezza, Ambari, Sqoop, Flume, Oozie, Java (JDK 1.6), Eclipse.
Confidential, Florham Park, NJ
Sr. Big Data/Hadoop Developer
Responsibilities:
- Supported data analysis projects using Elastic MapReduce on the Amazon Web Services (AWS) cloud and performed export and import of data to and from S3.
- Worked on MongoDB using CRUD operations (Create, Read, Update and Delete), indexing, replication and sharding features.
- Designed HBase row keys to store text and JSON as key-values, structuring the row key so that rows could be retrieved and scanned in sorted order.
- Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as MapReduce, Pig, Hive and Sqoop) as well as system-specific jobs (such as Java programs and shell scripts).
- Worked on custom Talend jobs to ingest, enrich and distribute data in the Cloudera Hadoop ecosystem.
- Created Hive tables and worked on them using HiveQL.
- Designed and implemented static and dynamic partitioning and bucketing in Hive.
- Developed multiple POCs using PySpark, deployed them on the YARN cluster, compared the performance of Spark with Hive and SQL, and was involved in end-to-end implementation of the ETL logic.
- Developed syllabus/curriculum data pipelines from syllabus/curriculum web services to HBase and Hive tables.
- Worked on cluster coordination services through ZooKeeper.
- Monitored workload, job performance and capacity planning using Cloudera Manager.
- Built applications using Maven and integrated with CI servers such as Jenkins to build jobs.
- Exported the analyzed data to the RDBMS using Sqoop to generate reports for the BI team.
- Worked collaboratively with all levels of business stakeholders to architect, implement and test Big Data based analytical solutions from disparate sources.
- Created cubes in Talend to produce different types of aggregations over the data and to visualize them.
- Involved in Agile methodologies, daily scrum meetings and sprint planning.
- Involved in the full life cycle of the project: design, analysis, logical and physical architecture modeling, development, implementation and testing.
- Wrote scripts to distribute queries for performance test jobs in the Amazon data lake.
- Created Hive tables, loaded transactional data from Teradata using Sqoop, and worked with highly unstructured and semi-structured data of 2 petabytes in size.
- Developed MapReduce (YARN) jobs for cleaning, accessing and validating the data.
- Created and ran Sqoop jobs with incremental load to populate Hive external tables.
- Developed optimal strategies for distributing the web log data over the cluster; imported and exported the stored web log data into HDFS and Hive using Sqoop.
- Installed and configured Apache Hadoop across multiple nodes on AWS EC2.
- Developed Pig Latin scripts to replace the existing legacy process with Hadoop, feeding the data to AWS S3.
- Responsible for building scalable distributed data solutions using Cloudera Hadoop.
- Designed and developed automation test scripts using Python.
- Integrated Apache Storm with Kafka to perform web analytics and move clickstream data from Kafka to HDFS.
- Wrote Pig scripts to transform raw data from several data sources into baseline data.
- Analyzed the SQL scripts and designed the solution to implement them using Spark.
- Implemented Hive generic UDFs to incorporate business logic into Hive queries.
- Responsible for developing a data pipeline with Amazon AWS to extract data from weblogs and store it in HDFS.
- Uploaded streaming data from Kafka to HDFS, HBase and Hive by integrating with Storm.
- Analyzed the web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration and the most visited page on the website (see the sketch after this list).
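The web log metrics above were produced with HiveQL; the sketch below shows the same style of aggregation (unique visitors per day and page views) expressed with Spark SQL in Scala. The web_logs table and its column names are assumptions for illustration only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object WeblogMetrics {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("WeblogMetrics").enableHiveSupport().getOrCreate()

    // Assumed Hive table holding the raw web log data (visitor_id, page, event_time, ...).
    val logs = spark.table("web_logs")

    // Daily unique visitors and page views.
    val dailyMetrics = logs
      .groupBy(to_date(col("event_time")).as("day"))
      .agg(
        countDistinct("visitor_id").as("unique_visitors"),
        count("*").as("page_views"))
      .orderBy("day")

    dailyMetrics.show(50, truncate = false)
  }
}
```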
Environment: Hadoop, HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Oozie, Maven, Python, Shell Scripting, CDH, MongoDB, Cloudera, AWS (S3, EMR), SQL, Scala, Spark, RDBMS, Java, HTML, PySpark, JavaScript, Web Services, Kafka, Storm, Talend.
Confidential, Minneapolis, MN
Sr. Java Full Stack Developer
Responsibilities:
- Developed CSS-based page layouts that are cross-browser compatible and standards-compliant.
- Developed HTML views with HTML5, CSS3, Bootstrap and AngularJS 1.0; developed new requirements with Spring, Struts and Hibernate.
- Used jQuery for basic animation and end-user screen customization.
- Developed creative intuitive user interfaces that address business and end-user needs, while considering the technical, physical and temporal constraints of the users.
- Developed internationalized multi-tenant SaaS solutions with responsive UIs using Java or ReactJS, with Node.js and CSS.
- Involved in the development of the presentation layer and GUI framework using AngularJS and HTML.
- Built different modules, controllers, templates, custom filters and directives in AngularJS.
- Designed dynamic and browser-compatible pages using HTML5, CSS3, jQuery and JavaScript.
- Involved in rendering additional components with custom HTML tags using React.js.
- Used Groovy and Spring Boot to collect data from users and packaged the data as JSON for distribution to applications.
- Provided expert technical leadership to customers and partners regarding all aspects of Pivotal Cloud Foundry (PCF).
- Ensured the successful architecture and deployment of enterprise-grade PaaS solutions using PCF, as well as proper operation during initial application migration and net-new development.
- Involved in writing application level code to interact with APIs, Web Services using AJAX and Angular resources.
- Developed code for Responsive web design in JavaScript using frameworks like Angular.js, React.js.
- Experience in developing cloud-based applications using Spring Cloud and Pivotal Cloud Foundry.
- Responsible for making responsive web pages using Twitter Bootstrap and media queries.
- Enhanced user experience by designing new web features using MVC Framework like Backbone.js and Node.js.
- Implemented Grails Services and controllers to perform actions.
- Experience in upgrading and migrating various versions of MongoDB on different platforms.
- Making changes to the existing web applications and creating new components using React.js.
- Reported bugs and tracked defects using JIRA.
- Worked within an Agile methodology.
- Managed project tasks with the Grunt task runner.
- Wrote code in HTML5/HTML, CSS3/CSS, Angular.js, JavaScript, jQuery, Ajax, JSON and Bootstrap with a MySQL database as the back end.
- Involved in developer testing, review and troubleshooting.
- Developed UI tests with Protractor and Java tests in JUnit.
- Used Jenkins for Continuous Integration. Used TOAD for managing, monitoring and analyzing the database.
- Used Maven to build the application.
- Designed and Developed automation script using Selenium Web Driver in Eclipse.
- Used the LAMP stack for building dynamic websites and web applications.
- Handled response data from RESTful web services using XML, JSON and jQuery to update the UI; interacted with Java controllers (jQuery, Ajax and JSON) to read and write data from back-end systems.
- Created GET/PUT requests and responses using RESTful web services.
Environment: Java, J2EE, Swing, Oracle 11g, MySQL, Eclipse 3.4, WebLogic 9.2, GUI, Spring, Hibernate, HTML, HTML5, CSS3, JavaScript, JUnit, AngularJS 2.0, React.js, Backbone.js, Node.js, jQuery, Web Services, Maven, Jenkins, Redux, Toad, Grunt, Tortoise SVN, Putty, LAMP, Visio, Team Track, Quality Center.
Confidential
Sr. Java/J2EE Developer
Responsibilities:
- Used JSF framework to implement MVC design pattern.
- Developed and coordinated complex, high-quality solutions for clients using J2SE, J2EE, Servlets, JSP, HTML, Struts, Spring MVC, SOAP, JavaScript, jQuery, JSON and XML.
- Wrote JSF managed beans, converters and validators following framework standards, and used explicit and implicit navigation for page navigation.
- Designed and developed Persistence layer components using Hibernate ORM tool.
- Designed the UI using JSF tags, Apache Tomahawk and RichFaces.
- Used Oracle 10g as the back end to store and fetch data.
- Experienced in using IDEs such as Eclipse and NetBeans, integrated with Maven.
- Created real-time reporting systems and dashboards using XML, MySQL and Perl.
- Worked on RESTful web services that enforced a stateless client-server model and supported JSON (migrating parts of the service from SOAP to REST); involved in detailed analysis based on the requirements documents.
- Involved in Design, development and testing of web application and integration projects using Object Oriented technologies such as Core Java, J2EE, Struts, JSP, JDBC, Spring Framework, Hibernate, Java Beans, Web Services (REST/SOAP), XML, XSLT, XSL, and Ant.
- Designed and implemented SOA-compliant management and metrics infrastructure for Mule ESB, utilizing the SOA management components.
- Used Node.js for server-side rendering; implemented modules in Node.js to integrate with designs and requirements.
- Used JAX-WS for the front-end module to interact with the back-end module, as they run on two different servers.
- Responsible for offshore deliverables; provided design and technical help to the team and reviewed work to meet quality and timeline goals.
- Migrated existing Struts application to Spring MVC framework.
- Provided and implemented numerous solution ideas to improve the performance and stabilize the application.
- Extensively used LDAP with Microsoft Active Directory for user authentication at login.
- Developed unit test cases using JUnit.
- Created the project from scratch using AngularJS as the front end and Node.js/Express as the back end.
- Involved in developing Perl scripts and other scripts such as JavaScript.
- Used Tomcat as the web server to deploy the OMS web application.
- Used the SOAP Lite module to communicate with different web services based on the given WSDL.
- Prepared technical reports and documentation manuals during program development.
Environment: Java, J2EE, Swing, Oracle, MySQL, Eclipse, WebLogic, GUI, Spring, Hibernate, HTML, HTML5, CSS3, JavaScript, JUnit, AngularJS 2.0, React.js, Backbone.js, Node.js, jQuery, Web Services, Maven, Jenkins, SVN, Putty, LAMP, Visio, Team Track, Quality Center.