
Sr. Big Data Engineer Resume


Plano, TX

SUMMARY:

  • An experienced Big Data/Hadoop developer with varying levels of expertise across Big Data/Hadoop ecosystem projects, including Spark Streaming, HDFS, MapReduce, NiFi, Hive, HBase, Storm, Kafka, Flume, Sqoop, ZooKeeper, and Oozie.
  • In-depth and extensive knowledge of Hadoop architecture.
  • Extensive experience with real-time streaming technologies such as Spark, Storm, and Kafka.
  • Experience installing and working with HDP (Hortonworks Data Platform).
  • Experience in Hadoop cluster sizing and cluster design.
  • Extensive experience in software design, development, and middleware programming for client/server, multi-tier, and web-based front-end applications in Java, J2EE, and web services.
  • Expertise in application development using Java, Servlets, JSP, Struts, Spring, Hibernate, JUnit, Log4j, XML, XSD, XSLT, Web Services, ANT, JavaBeans, EJB, JMS, Maven, etc.
  • Experience working with different file formats, including Apache Avro, SequenceFile, JSON, XML, and flat text files.
  • Worked closely with business clients and partners to understand business requirements, develop use cases, and design solutions.
  • Extensive experience executing projects successfully in on-site and offshore models, leading and managing projects and teams independently.
  • Imported data from databases to HDFS and vice versa using Sqoop.
  • 10+ years of experience in large-scale development, deployment, and maintenance of middleware applications in the telecommunications industry with Confidential.

TECHNICAL SKILLS:

Big Data/Hadoop Technologies: HDFS, MapReduce, Yarn, Spark, NiFi, Hive, HBase, ZooKeeper, Oozie, Pig, Sqoop, Flume, Apache Avro, Storm, Kafka

Languages: JAVA, PL/SQL, XML, Groovy

Java & Web Technologies: JSP, Servlets, JDBC, JNDI, JMS, EJB, SOA, Web Services, XML Schema, XSLT

Reports: Jasper

Frameworks: Struts, Spring, Hibernate

Scripting: JavaScript

Design Technologies: OOAD with UML (Rational Rose, Microsoft Visio).

Server Technologies: Web Logic, Web Sphere, Apache Tomcat

Operating Systems: Windows, UNIX, Linux

Databases: Oracle 10g, MySQL, SQL Server, MS Access, DB2

Tools: Rational Rose (ClearQuest, ClearCase), Eclipse, XML Spy, TOAD, ANT, Jenkins, Maven, MS Office, JUnit, CVS, SVN, SOAP UI, PuTTY, WinSCP, FileZilla, Splunk, Nexus

Methodologies: Waterfall model, Agile

PROFESSIONAL EXPERIENCE:

Confidential

Sr. Big Data Engineer

Responsibilities:

  • Develop a Spark Streaming application to read raw packet data from Kafka topics, format it as JSON, and push it back to Kafka for future use cases.
  • Develop a high-fidelity Spark Kafka streaming application that consumes JSON-format packet messages and returns geolocation data to the mobile application for the requested IMEI.
  • Save HUM packet data in HBase for future analytics.
  • Performance-tune Spark applications.
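A minimal sketch of the format-to-JSON step such a streaming job performs, written as a plain Java function (the packet layout and field names here are hypothetical, and the Spark/Kafka wiring is omitted):

```java
// Sketch of the per-record transform a Spark Streaming job might apply
// before producing results back to a Kafka topic. The comma-separated
// layout and the field names (imei, ts, lat, lon) are assumptions for
// illustration; the real packet format is not shown in this resume.
public class PacketToJson {

    // Raw packet assumed to be: imei,timestamp,lat,lon
    public static String toJson(String rawPacket) {
        String[] parts = rawPacket.split(",");
        if (parts.length != 4) {
            throw new IllegalArgumentException("unexpected packet: " + rawPacket);
        }
        return String.format(
            "{\"imei\":\"%s\",\"ts\":%s,\"lat\":%s,\"lon\":%s}",
            parts[0], parts[1], parts[2], parts[3]);
    }
}
```

In the actual job, a function like this would be mapped over each Kafka record before the result is written back to a topic for downstream consumers.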

Environment: Windows 7, CDH, HDFS, Yarn, Spark, Kafka, Hive, Phoenix, HBase, Scala, Java, Oozie, ZooKeeper, Git.

Confidential

Sr. Big Data Engineer

Responsibilities:

  • Install HDP and HDF (NiFi) multi-node clusters.
  • Design the architecture of the project.
  • Design and develop Hive and HBase data structures and Oozie workflows.
  • Configure NameNode and ResourceManager high availability on the HDP cluster and perform failover testing.
  • Develop a NiFi workflow to pick up multiple retail files from an FTP location and move them to HDFS on a daily basis.
  • Develop Spark jobs to store daily retail files in staging tables.
  • Implement Type 1 and Type 2 data life cycle logic through Spark jobs using Spark SQL.
  • Perform cleansing, formatting, and processing of daily data through Spark SQL and store it in Hive as well as Phoenix/HBase.
  • Performance-tune Phoenix/HBase and Hive queries and Spark jobs.
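The Type 1 vs. Type 2 distinction above can be illustrated with a small, Spark-free Java sketch: Type 1 simply overwrites the existing value, while Type 2 expires the current row and appends a new current one. In the real pipeline this logic runs as Spark SQL over staging tables; the key/value shape here is a hypothetical stand-in.

```java
import java.util.*;

// Illustrative sketch of slowly-changing-dimension merge logic over
// in-memory collections rather than Spark SQL staging tables.
public class ScdMerge {

    // Type 1: the latest value overwrites the existing one; no history kept.
    public static Map<String, String> type1(Map<String, String> target,
                                            String key, String value) {
        Map<String, String> out = new HashMap<>(target);
        out.put(key, value);
        return out;
    }

    // A Type 2 row: value plus a current-record flag.
    public static class Row {
        public final String key, value;
        public final boolean current;
        public Row(String key, String value, boolean current) {
            this.key = key; this.value = value; this.current = current;
        }
    }

    // Type 2: expire the current row for the key, then append a new current row.
    public static List<Row> type2(List<Row> target, String key, String value) {
        List<Row> out = new ArrayList<>();
        for (Row r : target) {
            out.add(r.current && r.key.equals(key)
                    ? new Row(r.key, r.value, false)   // expire old version
                    : r);
        }
        out.add(new Row(key, value, true));            // new current version
        return out;
    }
}
```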

Environment: Windows 2000, OS X Yosemite, NiFi, HDFS, Spark, Yarn, Scala, Hive, Phoenix, HBase, Ambari, Java, Oozie, Sqoop, MS SQL Server, ZooKeeper, Git, Scala IDE.

Confidential

Big Data Engineer

Responsibilities:

  • Develop a NiFi workflow to pick up data from a REST API server, from the data lake, and from an SFTP server and send it to the Kafka broker.
  • Implement Spark Kafka streaming to pick up the data from Kafka and send it to the Spark pipeline.
  • Implement NiFi to Spark streaming directly, without using Kafka internally, to provide various options to the client in a single Confidential.

Environment: OS X Yosemite, NiFi, Spark, Kafka, HDFS, Yarn, Scala, ZooKeeper, Ambari, Java, Git, Scala IDE.

Confidential

Big Data Engineer

Responsibilities:

  • Design and develop a Spark streaming data pipeline using Scala to process provider, affiliation, patient, header, and service-line data.
  • Design and develop Hive and HBase data structures.
  • Develop an address-standardization flow to standardize incoming claim and provider addresses.
  • Develop Java/Spring-based middleware component services to fetch data from HBase via the Phoenix SQL layer for various web UI use cases.
  • Performance-tune HBase, Phoenix, and Hive queries and Spark streaming code.
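A hedged sketch of how such a middleware service might read HBase through Phoenix's JDBC layer. The table and column names are hypothetical, and only the query builder is pure enough to run without a live cluster; the fetch method shows the intended JDBC usage.

```java
import java.sql.*;

// Sketch of a data-access component such a Java/Spring middleware service
// might use. PROVIDER and its columns (NPI, NAME, ADDRESS, STATE) are
// hypothetical names; the real schema is not shown in this resume.
public class ProviderDao {

    static final String TABLE = "PROVIDER";

    // Pure helper: builds the parameterized Phoenix SQL statement.
    public static String buildQuery(int limit) {
        return "SELECT NPI, NAME, ADDRESS FROM " + TABLE
             + " WHERE STATE = ? LIMIT " + limit;
    }

    // Not executed here: requires the Phoenix JDBC driver and a live
    // HBase/ZooKeeper quorum behind the Connection.
    public static ResultSet fetchByState(Connection conn, String state, int limit)
            throws SQLException {
        PreparedStatement ps = conn.prepareStatement(buildQuery(limit));
        ps.setString(1, state);
        return ps.executeQuery();
    }
}
```

Phoenix exposes HBase tables through standard JDBC, which is why a Spring service can treat it like any other SQL datasource.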

Environment: Windows 2000, OS X Yosemite, HDFS, Spark, Scala, Hive, HBase, Phoenix, ZooKeeper, Ambari, Java, BMC Datavolve, Spring, Vanity soft APIs, JavaScript, Node.js, etc.

Confidential, Plano, TX.

Big Data Engineer

Responsibilities:

  • Installed and configured the HDP service stack.
  • Involved in cluster-sizing discussions and cluster design for installing various components on the cluster.
  • Developed MapReduce programs for schema validation and IVP validation to validate different versions of XML requests and responses.
  • Created Sqoop integration scripts to ingest data from HDFS to the database and from the database to HDFS.
  • Ingested data into the Hive staging environment with Sqoop, moved it to the master data set after cleansing with Pig, and loaded key metadata into the database.
  • Wrote Hive scripts to load the payload into external tables from HDFS.
  • Wrote scripts for creating, truncating, dropping, and altering HBase tables that store MapReduce job output for later analytics.
  • Monitored the cluster and troubleshot issues using Ambari.
  • Developed shell scripts, invoked programmatically, to perform operations on the HDFS file system.
  • Designed and developed a controller that invokes Splunk and Nexus adapters for log extraction and kit download, respectively, to extract schemas.
  • Scheduled nightly batch jobs using Oozie to perform schema validation and IVP transformation at scale, taking advantage of the power of Hadoop.
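The per-document schema check at the heart of such a validator can be sketched with the JDK's built-in XML validation API; the toy XSD here is a stand-in for the real request/response schemas, which are not shown in this resume.

```java
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import java.io.StringReader;

// Minimal sketch of the validation step a schema-validator MapReduce job
// would apply to each XML request/response record.
public class SchemaValidator {

    // Returns true if the XML document conforms to the given XSD.
    public static boolean isValid(String xsd, String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            // Parse or validation failure: treat the record as invalid.
            return false;
        }
    }
}
```

In the actual job, a mapper would call a check like this per record and route valid and invalid documents to separate outputs.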

Environment: Windows 2000, Java, Spring, AngularJS, HDFS, MapReduce, Storm, Kafka, Hive, HBase, Pig, ZooKeeper, Sqoop, Flume, Oozie, HCatalog, Ambari, shell scripting.

Confidential

Java Middleware Technical Lead

Responsibilities:

  • Managed projects' technical deliveries, coordinating extensively with the offshore team to deliver on schedule with quality.
  • Communicated project status, including project issues, up and down the management chain, including to senior management.
  • Communicated the project design to other application designers and team members.
  • Ensured process and SLA adherence for projects and deliverables based on AT&T standards (ITUP) and Agile methodology.
  • Managed change requests, including assisting the sponsor and stakeholders in understanding the impact of a change on schedule or features.
  • Involved in planning, creating application designs, validating high-level designs (HLDs) for accuracy and completeness against business requirements, programming the solutions, and completing unit testing with unit test plans.
  • Responsible for resolving design issues and developing strategies for ongoing improvements that support system flexibility, performance, and metrics reporting.
  • Extensively worked on Java, J2EE, XML, and schema designs, and on WebLogic server.
  • Responsible for release-level tasks; participated in 'Lessons Learned' meetings and applied the takeaways to improve processes for new projects.
  • Designed validation strategies and reviewed test cases and test plans prepared by the testing team.
  • Handled multiple projects simultaneously and communicated requirements and status effectively.
  • Delivered results in assigned timeframes that were sometimes quite short.
  • Actively involved in the entire software development life cycle, including design, development, testing, deployment, and support.

Environment: Windows XP, Java 1.6, EJB, JDBC, JMS, WebLogic 10.3, Oracle 10g, CVS, SVN, XML, XSD, XML Spy, Eclipse, Ant, Maven, Nexus, Jenkins, MS Office 2007, PuTTY (UNIX), J2EE, Web Services.
