Big Data Engineer Resume
San Ramon, CA
PROFESSIONAL SUMMARY:
- 9+ years of experience in software analysis, design, development and implementation of Java/J2EE applications and the Hadoop technology stack.
- 3+ years of hands-on experience working with HDFS, MapReduce and Hadoop ecosystem tools such as Hive, HBase, Sqoop, Flume, Pig, ZooKeeper, Apache Kafka and Storm.
- Experience developing ETL logic using Talend, including interfacing with Amazon AWS SQS queues and S3 buckets and developing REST APIs in Talend.
- Experience implementing data warehouse solutions on Greenplum and HAWQ databases to support analytics.
- Developed a UI application using AngularJS that consumes REST data from Apache Solr (search engine).
- Strong knowledge of NoSQL databases such as MongoDB and HBase, and of integrating them with Hadoop clusters.
- Worked on various Hadoop distributions, including Pivotal, Cloudera and Apache.
- Over 5 years of experience developing applications using creational, structural and behavioral J2EE design patterns (MVC architecture, Singleton, Factory, etc.).
- Experience working with Tableau BI, developing reports and interfacing with Greenplum Database to run analytics.
- Strong experience in Internet technologies, including J2EE, JSP, Struts, Servlets, JDBC, Apache Tomcat, JBoss, WebLogic, WebSphere, SOAP protocol, XML, XSL, XSLT, Java Beans and HTML.
- Experience writing web services using the Axis framework.
- In-depth understanding of and exposure to all phases of OOA/OOD and OOP.
- Strong SQL and PL/SQL experience and very good exposure to MS SQL Server, MySQL, Oracle and Greenplum databases.
- Involved in Confidential Central (Java project) to generate policy summary, system usage and forensic summary reports by parsing the large logs uploaded by multiple customers.
- Good experience in all phases of the Software Development Life Cycle.
- Knowledge of Hibernate persistence technology and web services.
- Experienced working in Linux, UNIX and Windows environments.
- Experience in methodologies such as Agile, Scrum and Test-Driven Development.
- Strong program analysis skills, with the ability to follow project standards.
- Strong ability to understand new concepts and applications.
- Excellent verbal and written communication skills, proven highly effective in interfacing across business and technical groups.
TECHNICAL SKILLS:
Big Data: Hadoop, MapReduce, HBase, Hive, Pig, Sqoop, Flume, YARN, Talend, Apache Solr, Kafka, Storm, Cassandra.
Languages: Java, J2EE, Python, PL/SQL, MapReduce.
Databases: Oracle 10g/11g, MS SQL Server, DB2, MS Access, MySQL, Sybase IQ, HBase, Cassandra, MongoDB, GemFire, Greenplum, HAWQ, PostgreSQL.
Technologies: Java, JDBC, Spring Framework, JPA, Hibernate, Web Services, Struts, JSP, Servlets, ANT, Maven, AngularJS, XML, JSON.
Tools: Eclipse, Maven, Gradle, SBT, Ant, SoapUI, SOAPSonar, JDeveloper, SQL Developer, JIRA, Eclipse MAT, IBM Thread Dump Analyzer, Tableau, Talend.
App Servers: IBM WebSphere, WebLogic, Apache Tomcat, JBoss, Jetty, tc Server.
Version control: Git, Perforce, Subversion.
PROFESSIONAL EXPERIENCE:
Confidential, San Ramon, CA
Big Data Engineer
Responsibilities:
- Wrote MapReduce jobs (Java) to implement data science algorithms in Hadoop and for data preprocessing.
- Imported and exported data into HDFS and Hive using Sqoop.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and extracted data from Oracle into HDFS using Sqoop.
- Created procedures/functions in HAWQ and Greenplum (MPP databases) to retrieve the desired result sets for reporting.
- Created external tables in Greenplum to load the processed data back into GPDB using the GPload utility and gpfdist.
- Wrote a Storm topology to accept events from a Kafka producer, emit them into Hive and generate alerts (see the Storm sketch below).
- Responsible for data mapping after scrubbing for a given set of requirements; used complex SQL queries as part of data mining and performed analysis, design and data modeling.
- Designed the overall ETL load process to move data incrementally and transform it into the new data model.
- Created Talend jobs to extract data from Hadoop and ingest it into Greenplum (MPP database) incrementally.
- Created Talend jobs to read messages from Amazon AWS SQS queues and download files from AWS S3 buckets.
- Developed REST APIs in Talend.
- Supported all data needs for application development, including data source identification, analysis, performance tuning and data migration.
- Worked with business stakeholders to understand and define the business problem, develop a working prototype and prepare data for analysis.
- Applied data cleansing/data scrubbing techniques to ensure consistency amongst data sets.
- Built REST APIs using Spring Boot (see the Spring Boot sketch below).
- Wrote Python scripts for text parsing/mining.
- Developed a UI application using AngularJS, integrated with Apache Solr to consume REST data.
- Worked alongside Tableau developers to help performance-tune visualizations, graphs and analytics.
- Worked on loading data into the GemFire XD database using table functions.
Technologies: HDFS, MapReduce, Hive, HBase, Talend, Java, Spring Boot, JPA, Cloud Foundry, HAWQ, JSON, XML, Python, Pivotal HD, Tableau, Greenplum, GPload, gpfdist, GemFire XD, AngularJS, UNIX Shell Scripts.
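
A minimal sketch of the kind of Storm topology referenced above (Kafka events in, alerts out), assuming the pre-1.0 backtype.storm / storm-kafka APIs of that era; the topic name, ZooKeeper host and AlertBolt filtering logic are illustrative placeholders rather than the actual project code, and the Hive sink is only noted in a comment.

import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.spout.SchemeAsMultiScheme;
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.StringScheme;
import storm.kafka.ZkHosts;

public class EventAlertTopology {

    // Hypothetical bolt: flags events whose payload contains "ERROR" as alerts.
    public static class AlertBolt extends BaseBasicBolt {
        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            String event = input.getString(0);       // raw Kafka message as a string
            if (event.contains("ERROR")) {
                collector.emit(new Values(event));   // a downstream bolt would persist/notify
            }
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("alert"));
        }
    }

    public static void main(String[] args) {
        // Kafka spout reading string events from an illustrative "events" topic.
        SpoutConfig spoutConfig = new SpoutConfig(
                new ZkHosts("localhost:2181"), "events", "/kafka-events", "event-alert");
        spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig), 1);
        builder.setBolt("alert-bolt", new AlertBolt(), 2).shuffleGrouping("kafka-spout");
        // A storm-hive HiveBolt would typically be wired here to land events in Hive.

        new LocalCluster().submitTopology("event-alert-topology", new Config(), builder.createTopology());
    }
}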
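
Likewise, a minimal sketch of a Spring Boot REST endpoint of the kind mentioned above; the /api/reports path, the ReportSummary fields and the hard-coded return value are hypothetical stand-ins for the real service logic.

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
@RestController
@RequestMapping("/api/reports")
public class ReportApiApplication {

    // Hypothetical response payload; Jackson serializes it to JSON automatically.
    public static class ReportSummary {
        public String customerId;
        public long eventCount;

        public ReportSummary(String customerId, long eventCount) {
            this.customerId = customerId;
            this.eventCount = eventCount;
        }
    }

    // GET /api/reports/{customerId} -> summary for one customer (illustrative value only).
    @RequestMapping(value = "/{customerId}", method = RequestMethod.GET)
    public ReportSummary summary(@PathVariable("customerId") String customerId) {
        return new ReportSummary(customerId, 42L);
    }

    public static void main(String[] args) {
        SpringApplication.run(ReportApiApplication.class, args);
    }
}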
Confidential, Santa Clara, CA
Software Engineer
Responsibilities:
- Worked on developing MapReduce jobs to parse large customer logs for analytics and report generation (see the sketch below).
- Moved all log files generated from various sources to HDFS for further processing.
- Responsible for support and maintenance: troubleshot issues customers faced with the Adaptive Authentication (On-Premise) product and helped resolve them.
- Worked closely with customers and the professional services team to troubleshoot product/installation-related issues and provide resolutions.
- Helped customers customize/configure the Adaptive Authentication product for their specific needs.
- Assisted customers in capturing thread/heap dumps on different application servers and analyzed them to identify application hangs and performance/memory-related issues.
- Analyzed AWR reports to troubleshoot/identify performance bottlenecks.
- Provided root cause analysis of incidents to stakeholders and clients.
- Managed communications to customers at all levels to maintain positive relationships.
- Reported software bugs (after reproducing in local environment) and customer suggestions (enhancement requests) to development and product management teams.
- Worked closely with the Professional Services team to verify new installs/upgrades and transition them over to support.
- Created internal environments to reproduce customer reported issues and test/verify hot fixes and upgrades.
- Created technical notes to contribute to the knowledge base.
- Performed 24/7 on-call duties once a month.
Environment: Java, Web Services (WSDL, SOAP), Spring, Hibernate, XML, ANT, Maven, Drools, Oracle, MS SQL Server, DB2, JDBC, JSON, MapReduce.
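
A minimal sketch of the kind of log-parsing MapReduce job described above, using the standard org.apache.hadoop.mapreduce API; the log layout (customer id as the first whitespace-delimited field) and the per-customer event count are illustrative assumptions, not the actual report logic.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class LogSummaryJob {

    // Mapper: emits (customerId, 1) per log line; assumes the customer id is the first field.
    public static class LogMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);
        private final Text customer = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\s+");
            if (fields.length > 0 && !fields[0].isEmpty()) {
                customer.set(fields[0]);
                context.write(customer, ONE);
            }
        }
    }

    // Reducer: sums event counts per customer for the usage/summary report.
    public static class LogReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                throws IOException, InterruptedException {
            long total = 0;
            for (LongWritable v : values) {
                total += v.get();
            }
            context.write(key, new LongWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "customer-log-summary");
        job.setJarByClass(LogSummaryJob.class);
        job.setMapperClass(LogMapper.class);
        job.setCombinerClass(LogReducer.class);
        job.setReducerClass(LogReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}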
Confidential
Java Developer
Responsibilities:
- Involved in Full Software Development Life Cycle (SDLC).
- Worked with object-oriented design patterns such as factories. Developed several factory classes that act as controllers and divert each HTTP request to a particular request handler class based on a request identification key (see the sketch below).
- Developed interfaces using JSP based on users, roles and permissions; screen options were displayed according to user permissions. This was coded using custom JSP tags and tag libraries.
- Developed code to handle web requests involving request handlers, business objects and data access objects.
- Set up the build process with the ANT framework to build and deploy the application; various ANT tasks were defined for compile, build, deploy and check-in/checkout from CVS.
- Worked with DB (DBO) classes and used JDBC drivers from different vendors, such as the Microsoft and WebLogic drivers for SQL Server and the MySQL Connector for MySQL.
- Designed and developed a user-usage logging facility using Apache Log4j 1.2.8, with logger levels such as INFO, DEBUG, WARN, ERROR and FATAL.
- Installed and administered SQL Server 2000. Implemented maintenance plans including backups, security, integrity checks and optimization, all with documentation.
- Designed and developed the database in Oracle.
- Involved in Export/Import of data using Data Transformation Services (DTS). Imported data from flat files to various databases and vice-versa.
- Worked on and modified the database schema according to client requirements.
- Adopted a three-tier approach consisting of a client tier, a business logic tier and a data tier.
- Tested the entire System according to the Use Cases using JMeter.
- Involved in tracing and troubleshooting large volumes of source code using logging tools like Log4j and classes such as PrintWriter.
- Used XML, XSL and XSLT for developing a dynamic and flexible system for handling data.
- Packaged and deployed the entire application code to integration testing environment for all the releases.
- Implemented coding using Extreme Programming practices; programmers followed all coding standards.
Environment: MS SQL Server 2000 (functions, views), Oracle 8i/9i, MySQL, Linux 8, Windows 2000, Apache Tomcat, WebSphere 5.0, WSAD, J2EE (Java 1.4, Servlets, JSP, JDBC-SQL), HTML, XML, UML, Eclipse 3, JMeter 2.0, JavaScript, CVS, ANT 1.5.1, JUnit, Log4j 1.2.8.
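
A minimal sketch of the factory-based request dispatch described above; the RequestHandler interface, the handler names and the request identification keys are hypothetical, illustrating the pattern rather than the original codebase.

import java.util.HashMap;
import java.util.Map;

// Hypothetical handler contract: each concrete handler services one type of web request.
interface RequestHandler {
    String handle(Map<String, String> requestParams);
}

class PolicySummaryHandler implements RequestHandler {
    public String handle(Map<String, String> requestParams) {
        return "policy summary for " + requestParams.get("customerId");
    }
}

class UsageReportHandler implements RequestHandler {
    public String handle(Map<String, String> requestParams) {
        return "usage report for " + requestParams.get("customerId");
    }
}

// Factory/controller: selects the handler from the request identification key.
public class RequestHandlerFactory {
    private static final Map<String, RequestHandler> HANDLERS = new HashMap<String, RequestHandler>();
    static {
        HANDLERS.put("POLICY_SUMMARY", new PolicySummaryHandler());
        HANDLERS.put("USAGE_REPORT", new UsageReportHandler());
    }

    public static RequestHandler getHandler(String requestIdKey) {
        RequestHandler handler = HANDLERS.get(requestIdKey);
        if (handler == null) {
            throw new IllegalArgumentException("No handler registered for key: " + requestIdKey);
        }
        return handler;
    }
}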