Sr. Big Data Engineer Resume
Plano, TX
SUMMARY:
- Experienced Big Data/Hadoop developer with varying levels of expertise across Big Data/Hadoop ecosystem projects, including Spark Streaming, HDFS, MapReduce, NiFi, Hive, HBase, Storm, Kafka, Flume, Sqoop, ZooKeeper, and Oozie.
- In-depth knowledge of Hadoop architecture.
- Extensive experience with real-time streaming technologies: Spark, Storm, and Kafka.
- Experience installing and working with HDP (Hortonworks Data Platform).
- Experience in Hadoop cluster sizing and cluster design.
- Extensive experience in software design, development, and middleware programming, and in building client/server, multi-tier, and web-based front-end applications with Java, J2EE, and web services.
- Expertise in application development using Java, Servlets, JSP, Struts, Spring, Hibernate, JUnit, Log4j, XML, XSD, XSLT, Web Services, Ant, JavaBeans, EJB, JMS, and Maven.
- Experience working with different file formats, including Apache Avro, SequenceFile, JSON, XML, and flat text files.
- Worked closely with business clients and partners to understand business requirements, develop use cases, and design solutions.
- Extensive experience executing projects successfully in onsite and offshore models, leading and managing projects and teams independently.
- Imported data between relational databases and HDFS using Sqoop.
- 10+ years of experience in large-scale development, deployment, and maintenance of middleware applications in the telecommunications industry with Confidential.
TECHNICAL SKILLS:
Big Data/Hadoop Technologies: HDFS, MapReduce, YARN, Spark, NiFi, Hive, HBase, ZooKeeper, Oozie, Pig, Sqoop, Flume, Apache Avro, Storm, Kafka
Languages: Java, PL/SQL, XML, Groovy
Java & Web Technologies: JSP, Servlets, JDBC, JNDI, JMS, EJB, SOA, Web Services, XML Schema, XSLT
Reports: Jasper
Frameworks: Struts, Spring, Hibernate
Scripting: JavaScript
Design Technologies: OOAD with UML (Rational Rose, Microsoft Visio).
Server Technologies: WebLogic, WebSphere, Apache Tomcat
Operating Systems: Windows, UNIX, Linux
Databases: Oracle 10g, MySQL, SQL Server, MS Access, DB2
Tools: Rational Rose (ClearQuest, ClearCase), Eclipse, XML Spy, TOAD, Ant, Jenkins, Maven, MS Office, JUnit, CVS, SVN, SOAP UI, PuTTY, WinSCP, FileZilla, Splunk, Nexus
Methodologies: Waterfall model, Agile
PROFESSIONAL EXPERIENCE:
Confidential
Sr. Big Data Engineer
Responsibilities:
- Develop a Spark Streaming application that reads raw packet data from Kafka topics, formats it as JSON, and publishes it back to Kafka for future use cases (sketched below).
- Develop a high-fidelity Spark/Kafka streaming application that consumes JSON-formatted packet messages and returns geolocation data to the mobile application for a requested IMEI.
- Save HUM packet data in HBase for future analytics.
- Performance-tune Spark applications.
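A minimal sketch of the read-format-publish flow above, using Spark Structured Streaming in Scala. The broker address, topic names, and JSON envelope fields are hypothetical placeholders, and the production job on CDH may well have used a different streaming API:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object PacketFormatter {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("packet-json-formatter")
      .getOrCreate()
    import spark.implicits._

    // Read raw packet records from the source topic (names are placeholders).
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "raw-packets")
      .load()

    // Treat the Kafka value as a string payload and wrap it in a JSON envelope.
    val asJson = raw
      .select($"key".cast("string"), $"value".cast("string").as("payload"))
      .select($"key", to_json(struct($"payload", current_timestamp().as("ts"))).as("value"))

    // Publish the formatted records to a second topic for downstream consumers.
    asJson.writeStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("topic", "packets-json")
      .option("checkpointLocation", "/tmp/chk/packet-json")
      .start()
      .awaitTermination()
  }
}
```

Writing the formatted stream to a separate topic keeps the raw feed intact, so future use cases can consume either form independently.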
Environment: Windows 7, CDH, HDFS, YARN, Spark, Kafka, Hive, Phoenix, HBase, Scala, Java, Oozie, ZooKeeper, Git.
Confidential
Sr. Big Data Engineer
Responsibilities:
- Install HDP and HDF (NiFi) multi-node clusters.
- Design the architecture of the project.
- Design and develop Hive and HBase data structures and Oozie workflows.
- Configure NameNode and ResourceManager high availability on the HDP cluster and perform failover testing.
- Develop a NiFi workflow to pick up multiple retail files from an FTP location and move them to HDFS on a daily basis.
- Develop Spark jobs to load the daily retail files into staging tables.
- Implement Type 1 and Type 2 data life cycle (slowly changing dimension) logic in Spark jobs using Spark SQL (sketched below).
- Cleanse, format, and process daily data with Spark SQL and store the results in Hive as well as Phoenix/HBase.
- Performance-tune Phoenix/HBase, Hive queries, and Spark jobs.
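A minimal Spark SQL sketch of the Type 1 and Type 2 logic above. The table and column names (stg_retail, dim_retail, item_id, price, eff_date, end_date, is_current) are hypothetical placeholders for the actual retail schema:

```scala
import org.apache.spark.sql.SparkSession

object RetailScdJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("retail-scd")
      .enableHiveSupport()
      .getOrCreate()

    // Type 1: overwrite changed attributes in place; no history is kept.
    spark.sql("""
      SELECT d.item_id, COALESCE(s.price, d.price) AS price
      FROM dim_retail d
      LEFT JOIN stg_retail s ON d.item_id = s.item_id
    """).write.mode("overwrite").saveAsTable("dim_retail_type1")

    // Type 2, step 1: close out current rows whose attributes changed.
    val closedOut = spark.sql("""
      SELECT d.item_id, d.price, d.eff_date,
             CASE WHEN s.item_id IS NOT NULL AND s.price <> d.price
                  THEN current_date() ELSE d.end_date END AS end_date,
             CASE WHEN s.item_id IS NOT NULL AND s.price <> d.price
                  THEN false ELSE d.is_current END AS is_current
      FROM dim_retail d
      LEFT JOIN stg_retail s ON d.item_id = s.item_id AND d.is_current
    """)

    // Type 2, step 2: append a fresh current row for each new or changed item.
    val newRows = spark.sql("""
      SELECT s.item_id, s.price, current_date() AS eff_date,
             CAST(NULL AS DATE) AS end_date, true AS is_current
      FROM stg_retail s
      LEFT JOIN dim_retail d ON s.item_id = d.item_id AND d.is_current
      WHERE d.item_id IS NULL OR s.price <> d.price
    """)

    closedOut.union(newRows).write.mode("overwrite").saveAsTable("dim_retail_type2")
  }
}
```

Writing the merged result to a new table sidesteps Spark's restriction on reading and overwriting the same table within one job.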
Environment: Windows 2000, OS X Yosemite, NiFi, HDFS, Spark, YARN, Scala, Hive, Phoenix, HBase, Ambari, Java, Oozie, Sqoop, MS SQL Server, ZooKeeper, Git, Scala IDE.
Confidential
Big Data Engineer
Responsibilities:
- Develop a NiFi workflow to pick up data from a REST API server, the data lake, and an SFTP server, and send it to the Kafka broker.
- Implement Spark/Kafka streaming to pull data from Kafka into the Spark pipeline (sketched below).
- Implement direct NiFi-to-Spark streaming, without Kafka in between, to give the client multiple options within a single Confidential.
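A minimal sketch of the Kafka-to-Spark leg above, using the spark-streaming-kafka-0-10 direct stream in Scala. The broker, group id, topic name, and batch interval are hypothetical placeholders, and the real pipeline's processing replaces the placeholder count:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object KafkaToSpark {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("kafka-to-spark"), Seconds(5))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker:9092",
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "ingest-group",
      "auto.offset.reset"  -> "latest"
    )

    // Direct stream: each Spark partition reads its Kafka partitions without a receiver.
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("ingest-topic"), kafkaParams))

    // Hand each record's value to the downstream pipeline (count is a placeholder).
    stream.map(_.value()).count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The NiFi-to-Spark variant swaps this Kafka source for NiFi's site-to-site Spark receiver, trading Kafka's durable buffering for one fewer moving part.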
Environment: OS X Yosemite, NiFi, Spark, Kafka, HDFS, YARN, Scala, ZooKeeper, Ambari, Java, Git, Scala IDE.
Confidential
Big Data Engineer
Responsibilities:
- Design and develop a Spark streaming data pipeline in Scala to process provider, affiliation, patient, header, and service-line data.
- Design and develop Hive and HBase data structures.
- Develop an address-standardization flow for incoming claim and provider addresses.
- Develop Java/Spring-based middleware services that fetch data from HBase through the Phoenix SQL layer for various web UI use cases (sketched below).
- Performance-tune HBase, Phoenix, Hive queries, and Spark streaming code.
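A minimal sketch of the middleware's data access path above: querying HBase through the Phoenix SQL layer over JDBC, shown in Scala for brevity (the actual services are Java/Spring). The ZooKeeper quorum, table, and columns are hypothetical placeholders:

```scala
import java.sql.DriverManager

object PhoenixQuery {
  def main(args: Array[String]): Unit = {
    // Phoenix exposes HBase tables through a standard JDBC driver.
    Class.forName("org.apache.phoenix.jdbc.PhoenixDriver")
    val conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181")
    try {
      val stmt = conn.prepareStatement(
        "SELECT provider_id, name, address FROM provider WHERE provider_id = ?")
      stmt.setString(1, "P12345")
      val rs = stmt.executeQuery()
      while (rs.next()) {
        println(s"${rs.getString("provider_id")} | ${rs.getString("name")} | ${rs.getString("address")}")
      }
    } finally {
      conn.close()
    }
  }
}
```

In the Spring services the same query would typically sit behind a DAO or JdbcTemplate, but the JDBC surface Phoenix presents is identical.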
Environment: Windows 2000, OS X Yosemite, HDFS, Spark, Scala, Hive, HBase, Phoenix, ZooKeeper, Ambari, Java, BMC Datavolve, Spring, Vanity soft APIs, JavaScript, Node.js.
Confidential, Plano, TX.
Big Data Engineer
Responsibilities:
- Install and configure the HDP service stack.
- Participate in cluster-sizing discussions and cluster design for installing the various components on the cluster.
- Develop MapReduce programs for schema validation and IVP validation of different versions of XML requests and responses.
- Create Sqoop integration scripts to ingest data from HDFS to the database and from the database to HDFS.
- Ingest data into the Hive staging environment with Sqoop; after cleansing with Pig, move it to the master data set and load key metadata into the database.
- Write Hive scripts to load payloads from HDFS into external tables.
- Write scripts to create, truncate, drop, and alter the HBase tables that store MapReduce job output for later analytics.
- Monitor and troubleshoot the cluster using Ambari.
- Develop shell scripts, invoked programmatically, to perform operations on the HDFS file system (sketched below).
- Design and develop a controller that invokes the Splunk adapter for log extraction and the Nexus adapter for kit download and schema extraction.
- Schedule nightly batch jobs with Oozie to perform schema validation and IVP transformation at scale, taking advantage of Hadoop's parallelism.
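The shell scripts above wrap routine HDFS operations; below is a minimal sketch of equivalent calls through the Hadoop FileSystem API in Scala, with hypothetical placeholder paths:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object HdfsOps {
  def main(args: Array[String]): Unit = {
    // Picks up core-site.xml / hdfs-site.xml from the classpath.
    val fs = FileSystem.get(new Configuration())
    val staging = new Path("/data/staging/payloads")
    val archive = new Path("/data/archive/payloads")

    if (!fs.exists(archive)) fs.mkdirs(archive)

    // Move each processed file from staging to archive.
    fs.listStatus(staging).filter(_.isFile).foreach { status =>
      fs.rename(status.getPath, new Path(archive, status.getPath.getName))
    }
    fs.close()
  }
}
```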
Environment: Windows 2000, Java, Spring, AngularJS, HDFS, MapReduce, Storm, Kafka, Hive, HBase, Pig, ZooKeeper, Sqoop, Flume, Oozie, HCatalog, Ambari, shell scripting.
Confidential
Java Middleware Technical Lead
Responsibilities:
- Managed the project's technical deliveries, coordinating extensively with the offshore team to deliver on schedule with quality.
- Communicated project status and issues up and down the management chain, including to senior management.
- Communicated the project design to other application designers and team members.
- Ensured process and SLA adherence for projects and deliverables based on AT&T standards (ITUP) and Agile methodology.
- Managed change requests, including helping the sponsor and stakeholders understand the impact of each change on schedule and features.
- Involved in planning, creating application designs, validating high-level designs (HLDs) for accuracy and completeness against business requirements, programming the solutions, and completing unit testing with unit test plans.
- Responsible for resolving design issues and developing strategies for ongoing improvements that support system flexibility, performance, and metrics reporting.
- Worked extensively with Java, J2EE, XML, and schema design on WebLogic Server.
- Responsible for release-level tasks; participated in "Lessons Learned" meetings and applied the takeaways to improve processes on new projects.
- Designed validation strategies and reviewed test cases and test plans prepared by the testing team.
- Handled multiple projects simultaneously while communicating requirements and status effectively.
- Delivered results within assigned timeframes, which were sometimes quite short.
- Actively involved in the entire software development life cycle, including design, development, testing, deployment, and support of software systems.
Environment: Windows XP, Java 1.6, EJB, JDBC, JMS, WebLogic 10.3, Oracle 10g, CVS, SVN, XML, XSD, XML Spy, Eclipse, Ant, Maven, Nexus, Jenkins, MS Office 2007, PuTTY (UNIX), J2EE, Web Services.