Sr. Big Data Engineer/ETL Resume
SUMMARY
- Proven Architect with 15+ years of engineering experience in the design and development of large-scale applications in Data Warehouse, Big Data, and BI environments.
- Worked with the Chief Technology Officer to improve software processes and team culture, plan platform roadmap execution, and convert ideas and vision into executable tasks.
- Managed geographically distributed teams responsible for developing and maintaining world-class data platforms and systems.
- Balanced many competing priorities, prioritizing work for the best results.
- Provided technical guidance, mentorship, and assistance as a go-to engineering point of contact.
- Highly adaptable to new technologies and enthusiastic about developing Proofs of Concept with them.
- Drove planning and review meetings as Design Architect for geographically distributed teams.
- Led code design and reviews for new functionality and bug fixes.
- Participated in cross-functional planning and reviews with leads from the Product Management and Quality Assurance functions.
- Managed escalations, production issues, and new product deployments to customers.
TECHNICAL SKILLS
- R, Machine Learning (ML), and Octave.
- Grid Computing, Hadoop HDFS, Pig Latin, Oozie, Hive, HBase, Druid Analytic Server, Spark, Storm.
- Oracle 11gR2, SQL Server, MySQL, Teradata V2R5, MSBI 2005 (Microsoft Business Intelligence), SSIS, SSAS, SSRS.
- Java, UNIX shell scripting, Visual Basic for Applications (VBA).
- SQL, PL/SQL, and T-SQL programming
PROFESSIONAL EXPERIENCE
Confidential
Sr. Big Data Engineer/ETL
Responsibilities:
- Developed the new PruDB2DL process with enhanced logging and tracking features for better monitoring. To fill the gaps and complete the process, designed and developed supporting tools such as a Log Purge and Archive tool, a Job Monitor for long-running jobs since Confidential did not use AutoSys notifications (a monitoring sketch appears after this section's technology list), a basic DQ process, and a job status report for the PruDB2DL process.
- Integrated and automated 5 of the 8 NFS feeds for the SIA project using FDE (Fast Data Exchange), a homegrown tool built with Scala and Spark, and made enhancements to the existing code.
- Delivered the WSG Alerts feeds and developed a Recon feed for the RET team; automated the entire process.
- Developed highly scalable mass loading of tables into the Data Lake for various departments such as GI (approximately 3,500 tables); a Sqoop loading sketch appears after this section's technology list.
- Mentored several team members to achieve team goals.
Technologies: Hive, Sqoop, Bash scripting, Scala/Spark
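The job monitor mentioned above can be illustrated with a minimal sketch: it assumes each active job records its start time in a simple status file and flags anything running past a threshold. The path, file format, and 4-hour threshold are hypothetical stand-ins, not the actual PruDB2DL implementation.

```python
"""Minimal long-running-job monitor (illustrative sketch only).

Assumes each active job appends a line "job_name<TAB>start_epoch" to a
status file; the path, format, and threshold are hypothetical.
"""
import time

STATUS_FILE = "/var/run/prudb2dl/active_jobs.tsv"  # hypothetical path
THRESHOLD_SECS = 4 * 60 * 60                       # flag jobs running > 4h

def long_running_jobs(path):
    now = time.time()
    with open(path) as fh:
        for line in fh:
            job, start = line.rstrip("\n").split("\t")
            elapsed = now - float(start)
            if elapsed > THRESHOLD_SECS:
                yield job, elapsed

if __name__ == "__main__":
    for job, elapsed in long_running_jobs(STATUS_FILE):
        # The real monitor notified the team; AutoSys notifications were not in use.
        print("ALERT: %s has been running for %.1f hours" % (job, elapsed / 3600.0))
```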
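The mass table loads followed a common Sqoop pattern: drive `sqoop import` from a small script over a list of tables. The sketch below shows the pattern only; the JDBC URL, credentials file, Hive database, and table list are hypothetical placeholders.

```python
import subprocess

JDBC_URL = "jdbc:oracle:thin:@//dbhost:1521/ORCL"  # hypothetical source
TABLES = ["GI.POLICY", "GI.CLAIMS"]  # in practice read from a manifest of ~3,500 tables

for table in TABLES:
    # One sqoop import per table, landing directly in a raw Hive database.
    subprocess.check_call([
        "sqoop", "import",
        "--connect", JDBC_URL,
        "--username", "etl_user",
        "--password-file", "/user/etl/.dbpass",
        "--table", table,
        "--hive-import",
        "--hive-database", "datalake_raw",
        "--hive-table", table.replace(".", "_").lower(),
        "-m", "4",  # parallel mappers per table
    ])
```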
Confidential
Data Engineer
Responsibilities:
- Designed and deployed 23-25 ETLs for the Data Science Team.
- Provided the necessary data for models, set up the model ETL, and uploaded scores; the models were developed in PySpark (a scoring sketch appears after this section's technology list).
- Developed a tool to generate DSMF definitions from Excel (a generator sketch appears after this section's technology list).
- Developed 15-17 DSMF definitions for the monitoring framework
Technologies: Hive, Bash scripting, Python, Oozie, PySpark
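The score-upload pattern above can be sketched in PySpark: load features, apply a persisted pipeline model, and write the scores back to Hive. All table names, the model path, and the selected columns are hypothetical examples, not the actual project objects.

```python
from pyspark.sql import SparkSession
from pyspark.ml import PipelineModel

spark = (SparkSession.builder
         .appName("model-scoring-etl")
         .enableHiveSupport()
         .getOrCreate())

# 1. Pull the features the data-science model expects (hypothetical table).
features = spark.table("feature_store.customer_features")

# 2. Score with a previously trained, persisted PySpark pipeline (hypothetical path).
model = PipelineModel.load("/models/churn/v1")
scores = model.transform(features).select("customer_id", "prediction", "probability")

# 3. Upload scores to a Hive table for downstream consumers.
scores.write.mode("overwrite").saveAsTable("analytics.churn_scores")
```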
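The DSMF definition format is internal, so the generator can only be sketched in outline: read rows from an Excel spec with pandas and emit one definition file per row. Every column and field name below is a hypothetical stand-in for the real format.

```python
import json
import pandas as pd

# Hypothetical spec sheet; real column names belong to the internal DSMF format.
sheet = pd.read_excel("dsmf_specs.xlsx")

for _, row in sheet.iterrows():
    definition = {
        "name": row["dataset"],
        "source_table": row["table"],
        "check": row["check_type"],       # e.g. row_count, null_ratio
        "threshold": float(row["threshold"]),
    }
    # One JSON definition file per spec row.
    with open("%s.json" % row["dataset"], "w") as fh:
        json.dump(definition, fh, indent=2)
```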
Confidential
Sr. Hadoop Consultant
Responsibilities:
- Worked side-by-side with customers and SA partners to evaluate technologies that could fill gaps in features, capabilities, or interoperability to enhance SA platforms across the enterprise
- Helped create and enhance technology/product strategy and roadmap
- Engaged with vendors and partners to manage licensing for all SA products
- Facilitated PoC/PoT in the SA Lab environments
- Defined processes for comprehensive training
- Worked on a Splice Machine PoT; its backend database is HBase, and it processes queries in native or Spark mode.
Confidential
Architect
Responsibilities:
- Delivered a multithreaded Java-based ETL scheduler, along with other tools such as TDE (Transaction Data Extractor), which extracts internal systems' data into GA. Calculated distances between stores from Google API longitude and latitude in order to determine neighborhood stores (a distance sketch appears after this section's technology list).
- Resolved scaling issues from the earlier versions by implementing Pig- and Oozie-based frameworks for new ETLs.
- Architected and delivered the latest version of GA with a new data model and a highly scalable, complete ETL, including features such as monitoring, SLA processing, and re-scheduling capabilities.
- Delivered a highly customizable, easy-to-use ad hoc reporting module for the sales team's needs using VBA, which was extended to generate QA reports.
- Developed several Pig utilities, such as a Murmur hash UDF, a UDF to handle the Parquet INT96 timestamp, and a PigLoadDB function to read data directly from a database (an INT96 conversion sketch appears after this section's technology list).
- Mentored development teams to achieve our project goals.
Technologies: Hadoop, Hive, HBase, Drill, MonetDB, MySQL, MS SQL Server, Kylin, Oozie, Pig, Druid, Talend, Spark (as a backend engine), SVN, Git.
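The neighborhood-store calculation reduces to the great-circle distance between two latitude/longitude points. The coordinates came from the Google API (not shown); the haversine sketch below and its 2 km cutoff are illustrative.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/long points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

# Stores within the cutoff are treated as neighbors (cutoff is illustrative).
if haversine_km(37.33, -121.89, 37.35, -121.91) < 2.0:
    print("neighborhood stores")
```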
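Parquet's legacy INT96 timestamp packs 8 little-endian bytes of nanoseconds-of-day followed by 4 little-endian bytes of a Julian day number. The actual UDF was written for Pig; the conversion at its core can be sketched as:

```python
import datetime
import struct

JULIAN_UNIX_EPOCH = 2440588  # Julian day number of 1970-01-01

def int96_to_datetime(raw):
    """Convert a 12-byte Parquet INT96 value to a UTC datetime."""
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    return (datetime.datetime(1970, 1, 1)
            + datetime.timedelta(days=julian_day - JULIAN_UNIX_EPOCH,
                                 microseconds=nanos_of_day // 1000))

# Sanity check: Julian day 2440588 at 0 ns is the Unix epoch.
assert int96_to_datetime(struct.pack("<qi", 0, 2440588)) == datetime.datetime(1970, 1, 1)
```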
Confidential, San Jose CA
Sr. Software Engineer
Responsibilities:
- FixStream is an advanced cloud operational analytics and visualization platform that provides the context and clarity to take action. FixStream's unique angle of differentiation is creating deep awareness of the location, health, and performance of each specific application and its associated infrastructure resources inside and across cloud data centers.
- Integrated TCP dump capture into the Meridian system. Provided a PoC on the feasibility and cost effectiveness of IPOQ. Worked on decoding SSL TCP packets to classify applications (a simplified classification sketch appears after this section's technology list). Developed the end-to-end application and deployed the tool using Chef.
Technologies: Hadoop/cloud-based technologies, Storm, Kafka, Elasticsearch, RabbitMQ, Java, Jenkins, Chef, Git
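The classification work decoded SSL traffic; the much-simplified sketch below uses scapy to group TCP packets from a tcpdump capture by well-known destination port. The port-to-application map and pcap name are hypothetical.

```python
from collections import Counter

from scapy.all import rdpcap
from scapy.layers.inet import IP, TCP

# Hypothetical port map; the real classifier decoded SSL rather than trusting ports.
PORT_APP = {443: "https", 3306: "mysql", 5672: "rabbitmq", 9092: "kafka"}

def classify(pcap_path):
    counts = Counter()
    for pkt in rdpcap(pcap_path):  # a capture file written by `tcpdump -w`
        if IP in pkt and TCP in pkt:
            counts[PORT_APP.get(pkt[TCP].dport, "unknown")] += 1
    return counts

if __name__ == "__main__":
    print(classify("capture.pcap"))
```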