
Big Data Engineer Resume


San Ramon, CA

PROFESSIONAL SUMMARY:

  • 9+ years of experience in Software Analysis, Design, Development, and Implementation of Java/J2EE applications and the Hadoop technology stack.
  • 3+ years of hands-on experience working with HDFS, MapReduce, and the Hadoop ecosystem, including Hive, HBase, Sqoop, Flume, Pig, ZooKeeper, Apache Kafka, and Storm.
  • Experience in developing ETL logic using Talend, including interfacing with Amazon AWS SQS queues and S3 buckets and developing REST APIs in Talend.
  • Experience implementing data warehouse solutions on Greenplum and HAWQ databases to support analytics.
  • Developed a UI application using AngularJS, consuming REST data from Apache Solr (search engine).
  • Strong knowledge of NoSQL databases such as MongoDB (document-oriented) and HBase (column-oriented), and of integrating them with a Hadoop cluster.
  • Worked on various Hadoop distributions, including Pivotal, Cloudera, and Apache.
  • Over 5 years of experience developing applications using creational, structural, and behavioral J2EE design patterns (MVC architecture, Singleton, Factory, etc.) and frameworks such as Struts.
  • Experience working with Tableau BI, developing reports and interfacing with Greenplum Database to run analytics.
  • Strong experience in Internet technologies, including J2EE, JSP, Struts, Servlets, JDBC, Apache Tomcat, JBoss, WebLogic, WebSphere, the SOAP protocol, XML, XSL, XSLT, JavaBeans, and HTML.
  • Experience writing web services using the Axis framework.
  • In-depth understanding of and exposure to all phases of OOA/OOD and OOP.
  • Strong SQL and PL/SQL experience, with very good exposure to MS SQL Server, MySQL, Oracle, and Greenplum databases.
  • Involved in Confidential Central (Java project) to generate policy summary, system usage, and forensic summary reports by parsing the huge logs uploaded by multiple customers.
  • Good experience in all phases of the Software Development Life Cycle.
  • Knowledge of Hibernate persistence technology and web services.
  • Experienced working in Linux, UNIX, and Windows environments.
  • Experience in methodologies such as Agile, Scrum, and Test-Driven Development.
  • Strong program analysis skills, with the ability to follow project standards.
  • Strong ability to understand new concepts and applications.
  • Excellent verbal and written communication skills, proven highly effective in interfacing across business and technical groups.

TECHNICAL SKILLS:

Big Data: Hadoop, MapReduce, HBase, Hive, Pig, Sqoop, Flume, YARN, Talend, Apache Solr, Kafka, Storm, Cassandra.

Languages: Java, J2EE, Python, PL/SQL, MapReduce.

Databases: Oracle 10g/11g, MS SQL Server, DB2, MS Access, MySQL, Sybase IQ, HBase, Cassandra, MongoDB, GemFire, Greenplum, HAWQ, PostgreSQL.

Technologies: Java, JDBC, Spring Framework, JPA, Hibernate, Web Services, Struts, JSP, Servlets, ANT, Maven, AngularJS, XML, JSON.

Tools: Eclipse, Maven, Gradle, SBT, Ant, SoapUI, Soap Sonar, JDeveloper, SQL Developer, JIRA, Eclipse MAT, IBM Thread Dump Analyzer, Tableau, Talend.

App Servers: IBM WebSphere, WebLogic, Apache Tomcat, JBoss, Jetty, tc Server.

Version control: Git, Perforce, Subversion.

PROFESSIONAL EXPERIENCE:

Confidential, San Ramon, CA

Big Data Engineer

Responsibilities:

  • Wrote MapReduce jobs (Java) to implement data science algorithms in Hadoop and to perform data preprocessing.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Oracle into HDFS using Sqoop.
  • Created procedures/functions in HAWQ and Greenplum (MPP databases) to retrieve the desired result sets for reporting.
  • Created external tables in Greenplum to load the processed data back into GPDB using the GPload utility and gpfdist.
  • Wrote a Storm topology to accept events from a Kafka producer, emit them into Hive, and generate alerts (see the Storm sketch after this list).
  • Responsible for data mapping after scrubbing for a given set of requirements; used complex SQL queries as part of data mining and performed analysis, design, and data modeling.
  • Designed the overall ETL load process to move data incrementally and transform it into the new data model.
  • Created Talend jobs to extract data from Hadoop and ingest it into Greenplum (MPP database) incrementally.
  • Created Talend jobs to read messages from Amazon AWS SQS queues and download files from AWS S3 buckets.
  • Developed REST APIs in Talend.
  • Supported all data needs for application development, including data source identification, analysis, performance tuning, and data migration.
  • Worked with the business to understand and define the business problem, develop a working prototype, and prepare data for analysis.
  • Applied data cleansing/data scrubbing techniques to ensure consistency amongst data sets.
  • Built a REST API using Spring Boot (see the Spring Boot sketch after this list).
  • Wrote Python scripts for text parsing/mining.
  • Developed a UI application using AngularJS, integrated with Apache Solr to consume REST data.
  • Worked with Tableau developers to help performance-tune the visualizations, graphs, and analytics.
  • Worked on loading data into the GemFire XD database using table functions.
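
A minimal sketch of the Kafka-to-Storm wiring behind the alerting bullet above, written against the pre-1.0 Storm and storm-kafka APIs (backtype.storm / storm.kafka packages). The ZooKeeper host, topic name, and the AlertBolt class are illustrative assumptions, and the bolt that writes events into Hive is only noted in a comment:

    import backtype.storm.Config;
    import backtype.storm.LocalCluster;
    import backtype.storm.spout.SchemeAsMultiScheme;
    import backtype.storm.topology.BasicOutputCollector;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.TopologyBuilder;
    import backtype.storm.topology.base.BaseBasicBolt;
    import backtype.storm.tuple.Tuple;
    import storm.kafka.KafkaSpout;
    import storm.kafka.SpoutConfig;
    import storm.kafka.StringScheme;
    import storm.kafka.ZkHosts;

    public class EventAlertTopology {

        /** Example bolt: inspects each Kafka event and raises an alert on error records. */
        public static class AlertBolt extends BaseBasicBolt {
            @Override
            public void execute(Tuple tuple, BasicOutputCollector collector) {
                String event = tuple.getString(0);
                if (event.contains("ERROR")) {
                    System.err.println("ALERT: " + event); // placeholder for real alerting
                }
                // A second bolt (not shown) would batch the events and write them into Hive.
            }

            @Override
            public void declareOutputFields(OutputFieldsDeclarer declarer) {
                // No downstream stream is declared in this sketch.
            }
        }

        public static void main(String[] args) throws Exception {
            // Kafka spout configuration; the ZooKeeper address and topic are assumptions.
            ZkHosts zkHosts = new ZkHosts("zookeeper-host:2181");
            SpoutConfig spoutConfig = new SpoutConfig(zkHosts, "events", "/kafka-events", "event-consumer");
            spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());

            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig), 1);
            builder.setBolt("alert-bolt", new AlertBolt(), 2).shuffleGrouping("kafka-spout");

            // Local cluster for testing; StormSubmitter.submitTopology(...) would be used on a real cluster.
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology("event-alerts", new Config(), builder.createTopology());
        }
    }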
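
A minimal Spring Boot REST controller sketch corresponding to the Spring Boot bullet above (it needs only the spring-boot-starter-web dependency); the endpoint path and report names are illustrative assumptions rather than the actual project API:

    import java.util.Arrays;
    import java.util.List;

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;
    import org.springframework.web.bind.annotation.RestController;

    @SpringBootApplication
    @RestController
    @RequestMapping("/api/reports")
    public class ReportApiApplication {

        // GET /api/reports -> names of available reports (hard-coded for this sketch).
        @RequestMapping(method = RequestMethod.GET)
        public List<String> listReports() {
            return Arrays.asList("policy-summary", "system-usage", "forensic-summary");
        }

        // GET /api/reports/{name} -> placeholder payload for a single report.
        @RequestMapping(value = "/{name}", method = RequestMethod.GET)
        public String getReport(@PathVariable String name) {
            return "Report: " + name;
        }

        public static void main(String[] args) {
            SpringApplication.run(ReportApiApplication.class, args);
        }
    }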

Technologies: HDFS, MapReduce, Hive, HBase, Talend, Java, Spring Boot, JPA, Cloud Foundry, HAWQ, JSON, XML, Python, Pivotal HD, Tableau, Greenplum, GPload, gpfdist, GemFire XD, AngularJS, UNIX Shell Scripts.

Confidential, Santa Clara, CA

Software Engineer

Responsibilities:

  • Worked on developing MapReduce jobs to parse huge customer logs for analytics and report generation (see the MapReduce sketch after this list).
  • Moved all log files generated from various sources to HDFS for further processing.
  • Responsible for support and maintenance: troubleshooting issues customers faced with the Adaptive Authentication (On-Premise) product and helping to resolve them.
  • Worked closely with customers and the Professional Services team to troubleshoot product/installation-related issues and provide resolutions.
  • Helped customers customize/configure the Adaptive Authentication product for their specific needs.
  • Assisted customers in capturing thread/heap dumps on different application servers and analyzed them to identify application hangs, performance issues, and memory-related issues.
  • Analyzed AWR reports to troubleshoot/identify performance bottlenecks.
  • Provided root cause analysis of incidents to stakeholders and clients.
  • Managed communications to customers at all levels to maintain positive relationships.
  • Reported software bugs (after reproducing in local environment) and customer suggestions (enhancement requests) to development and product management teams.
  • Worked closely with the Professional Services team to verify new installs/upgrades and transition them over to support.
  • Created internal environments to reproduce customer reported issues and test/verify hot fixes and upgrades.
  • Created technical notes to contribute to the knowledge base.
  • Performed 24/7 on call duties once a month.
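
An illustrative MapReduce job in the spirit of the log-parsing bullet above: it counts log lines per severity level. The assumed log layout (date, time, severity, message) and the input/output paths passed on the command line are examples, not the real customer log format:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class LogSeverityCount {

        public static class SeverityMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
            private static final LongWritable ONE = new LongWritable(1);
            private final Text severity = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                // Assumes a log line like: "2015-04-01 12:00:01 ERROR some message"
                String[] fields = value.toString().split("\\s+");
                if (fields.length > 2) {
                    severity.set(fields[2]);
                    context.write(severity, ONE);
                }
            }
        }

        public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
            @Override
            protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                    throws IOException, InterruptedException {
                long total = 0;
                for (LongWritable v : values) {
                    total += v.get();
                }
                context.write(key, new LongWritable(total));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "log severity count");
            job.setJarByClass(LogSeverityCount.class);
            job.setMapperClass(SeverityMapper.class);
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(LongWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
            FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }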

Environment: Java, Web Services (WSDL, SOAP), Spring, Hibernate, XML, ANT, Maven, Drools, Oracle, MS SQL Server, DB2, JDBC, JSON, MapReduce.

Confidential

Java Developer

Responsibilities:

  • Involved in the full Software Development Life Cycle (SDLC).
  • Worked with object-oriented design patterns such as the Factory pattern. Developed Factory classes that act as controllers and divert each HTTP request to a particular request handler class based on the request identification key (see the Factory sketch after this list).
  • Developed interfaces using JSP based on users, roles, and permissions; screen options were displayed according to user permissions. This was coded using custom tags in JSP with tag libraries.
  • Developed code to handle web requests involving request handlers, business objects, and data access objects.
  • Set up the build process with the ANT framework to build and deploy the application; various ANT tasks were defined for compile, build, deploy, and check-in/checkout from CVS.
  • Worked with database (DBO) classes and used JDBC drivers from different vendors, such as the Microsoft and WebLogic drivers for SQL Server and the MySQL Connector for MySQL databases (see the JDBC sketch after this list).
  • Designed and developed a user usage logging facility using Apache Log4J 1.2.8, with different logger levels such as DEBUG, INFO, WARN, ERROR, and FATAL (see the Log4J sketch after this list).
  • Installed and administered SQL Server 2000; implemented maintenance plans including backups, security, integrity checks, and optimization, all with documentation.
  • Designed and developed the database in Oracle.
  • Involved in Export/Import of data using Data Transformation Services (DTS). Imported data from flat files to various databases and vice-versa.
  • Modified the database schema according to client requirements.
  • Adopted three-tier approach consisting of Client Tier, Business Logic Tier, and Data Tier.
  • Tested the entire system against the use cases using JMeter.
  • Involved in tracing and troubleshooting large volumes of source code using logging tools such as Log4J and classes such as PrintWriter.
  • Used XML, XSL and XSLT for developing a dynamic and flexible system for handling data.
  • Packaged and deployed the entire application code to integration testing environment for all the releases.
  • Implemented Extreme Programming practices during coding; programmers followed all coding standards.
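
A compact sketch of the Factory-as-controller idea described in the bullets above: the request identification key selects a request handler class. The keys, handler bodies, and the use of generics (the original project targeted Java 1.4) are illustrative:

    import java.util.HashMap;
    import java.util.Map;

    public class RequestHandlerFactory {

        /** Common interface implemented by every request handler class. */
        public interface RequestHandler {
            String handle(Map<String, String> parameters);
        }

        private static final Map<String, RequestHandler> HANDLERS = new HashMap<String, RequestHandler>();

        static {
            // Each request identification key maps to one handler (keys are examples only).
            HANDLERS.put("LOGIN", new RequestHandler() {
                public String handle(Map<String, String> parameters) {
                    return "Handled login for " + parameters.get("user");
                }
            });
            HANDLERS.put("REPORT", new RequestHandler() {
                public String handle(Map<String, String> parameters) {
                    return "Generated report " + parameters.get("reportId");
                }
            });
        }

        /** The controller looks up a handler by the request identification key from the HTTP request. */
        public static RequestHandler getHandler(String requestKey) {
            RequestHandler handler = HANDLERS.get(requestKey);
            if (handler == null) {
                throw new IllegalArgumentException("No handler registered for key: " + requestKey);
            }
            return handler;
        }
    }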
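
A small JDBC sketch matching the vendor-driver bullet above; the MySQL Connector/J driver class, connection URL, credentials, and table are illustrative, and the SQL Server and WebLogic drivers follow the same load-driver/get-connection pattern with different class names and URLs:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class VendorJdbcExample {

        public static void main(String[] args) throws Exception {
            // Load a vendor JDBC driver and open a connection (names and URL are examples).
            Class.forName("com.mysql.jdbc.Driver");
            Connection conn = DriverManager.getConnection(
                    "jdbc:mysql://localhost:3306/appdb", "appuser", "password");

            // Run a simple query against an example table.
            Statement stmt = conn.createStatement();
            ResultSet rs = stmt.executeQuery("SELECT id, name FROM users");
            while (rs.next()) {
                System.out.println(rs.getInt("id") + " " + rs.getString("name"));
            }

            rs.close();
            stmt.close();
            conn.close();
        }
    }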
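
A short Log4J 1.2 usage sketch for the logging facility described above; the class name and messages are illustrative, and the log4j.properties configuration is assumed rather than shown:

    import org.apache.log4j.Logger;

    public class UsageLogger {

        // One logger per class, configured via log4j.properties (not shown).
        private static final Logger LOG = Logger.getLogger(UsageLogger.class);

        public void recordUserAction(String user, String action) {
            LOG.debug("Entering recordUserAction");                   // DEBUG: developer tracing
            LOG.info("User " + user + " performed action " + action); // INFO: normal usage logging
            try {
                // ... business logic for persisting the usage record would go here ...
            } catch (Exception e) {
                LOG.error("Failed to record action for user " + user, e); // ERROR: recoverable failure
                // LOG.warn(...) covers unusual but non-fatal situations;
                // LOG.fatal(...) is reserved for unrecoverable conditions.
            }
        }
    }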

Environment: MS SQL Server 2000, Functions, Views, Oracle 8i/9i, MySQL, Linux 8, Windows 2000, Apache Tomcat, WebSphere 5.0, WSAD, J2EE (Java 1.4, Servlets, JSP, JDBC-SQL), HTML, XML, UML, Eclipse 3, JMeter 2.0, JavaScript, CVS, ANT 1.5.1, JUnit, Log4J 1.2.8.
