Consultant - Spark & Scala Resume
San Francisco, CA
SUMMARY
- Over 7 years of experience in end-to-end product/solution development across Big Data/Hadoop/data management, mobile, client/server, and embedded technologies
- 5 years of experience developing Big Data solutions using Spark with Scala and the Spark ecosystem: Hive, Pig, Flume, Kafka, MongoDB, HBase, Cassandra, ZooKeeper, Sqoop, etc.
- Performance tuning of data analysis using Spark SQL
- Expertise in implementing Spark/Scala applications using higher-order functions for both batch and interactive analysis requirements
- Experience working with Azure Monitoring, Data Factory, Traffic Manager, Service Bus, and Key Vault
- Expertise in creating pipelines, linked services, and Databricks Delta tables in Azure
- Extensive experience applying business logic to transformed Spark RDDs using actions, Spark DataFrames, and Datasets (see the sketch after this list)
- Developed Spark/Scala code to perform ETL transformations
- Configured Spark Streaming to receive real-time data from Kafka and persist the streamed data to HDFS
- Hands-on experience with SQL, PL/SQL, ETL processes, and relational databases including Oracle and MS SQL Server; good experience in shell scripting
- Expertise with cloud computing technologies on AWS
- Experienced with NoSQL databases - HBase, MongoDB and Cassandra
- Hands-on experience in import/export of data using the data management tool Sqoop
- Experience in using Text, Parquet and Sequence files
- Thorough knowledge of Hadoop architecture and its data flow
- Developed and delivered solutions in predictive analytics and recommendation systems
- Expertise in using the Java API and Sqoop to export data from RDBMS into a DataStax Cassandra cluster
- Experience with Agile methodology (Scrum) across multiple projects
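A minimal sketch of the kind of batch analysis described above: a higher-order RDD transformation followed by an action, alongside an equivalent Spark SQL DataFrame aggregation. Input paths, column names (quantity, unit_price, order_ts), and the output location are all hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object BatchAnalysisSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("batch-analysis-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical input: order events stored as Parquet
    val orders = spark.read.parquet("/data/orders")

    // Higher-order transformations on an RDD, followed by an action
    val totalQuantity = orders.select($"quantity").as[Long].rdd
      .filter(_ > 0)
      .reduce(_ + _)

    // Equivalent DataFrame aggregation, tunable through Spark SQL
    val revenueByDay = orders
      .groupBy(to_date($"order_ts").as("order_date"))
      .agg(sum($"quantity" * $"unit_price").as("revenue"))

    revenueByDay.write.mode("overwrite").parquet("/data/reports/revenue_by_day")

    println(s"Total quantity: $totalQuantity")
    spark.stop()
  }
}
```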
TECHNICAL SKILLS
Programming Languages & Frameworks: Java, Scala, Spark Core, Spark SQL, Spark Streaming
NoSQL databases: MongoDB, HBase & Cassandra
Scripting Languages: Shell scripting
Messaging & ingestion: Kafka, Flume
Cloud: Microsoft Azure, AWS, Databricks, ADF, S3
Databases & languages: Oracle, SQL, PL/SQL
RTOS: Embedded Linux
Platforms: Hadoop, Hive, Pig
Source control tools: SVN, GIT
Web Technologies: HTML, CSS, Servlets, JSP, JavaScript.
Protocols: TCP/IP, UDP, NMEA
IDE: Eclipse and KDE
PROFESSIONAL EXPERIENCE
Confidential, San Francisco, CA
Consultant - Spark & Scala
Environment: Hadoop, Azure, ADF, Databricks, Spark, Spark SQL, Kafka, Spark Structured Streaming.
Responsibilities:
- Configured and set up an Azure Hybrid Connection to pull data from the WSI Service Layer.
- Implemented Spark Streaming jobs to consume messages from Kafka and load them into Delta tables (see the sketch after this list)
- Processed the data using Spark SQL DataFrames.
- Worked on Azure Event Hubs for application instrumentation and for user experience and workflow processing
- Provided day-to-day developer support to Azure customers, resolving escalated, complex issues with the creation of ingestion jobs into Delta tables in Azure Databricks and with exports of data to Teradata stage tables.
- Worked on Azure Blob Storage, creating config files used as sources for the pipelines
- Developed and ran pipelines, created linked services in Azure, and deployed and maintained Azure jobs from Dev to Prod.
- Designed and implemented the database schema, imported data, and built stored procedures on Azure SQL.
- Performed system monitoring: verified availability of all resources, reviewed system and application logs, and verified completion of scheduled jobs. Created Apache Spark based models and implementations to run business users' low-latency queries faster using in-memory techniques.
- Found the data requested by customers in our systems and sent them responses using Spark DataFrames in Azure
- Developed Azure pipelines, enhanced the core code written in Scala, and integrated it into the Azure pipelines.
- Integration testing and bug fixes
- Supported the developed jobs in production and handled deployments to production
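A minimal sketch of a Structured Streaming job of the kind described above, consuming from Kafka and appending to a Delta table. The broker address, topic name, and checkpoint/table paths are assumptions, and the Delta writer presupposes a Databricks runtime or the delta-core library on the classpath.

```scala
import org.apache.spark.sql.SparkSession

object KafkaToDeltaSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-to-delta-sketch")
      .getOrCreate()

    // Hypothetical Kafka brokers and topic
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092")
      .option("subscribe", "events")
      .option("startingOffsets", "latest")
      .load()

    // Kafka delivers key/value as binary; cast the value to string for downstream parsing
    val events = raw.selectExpr("CAST(value AS STRING) AS json", "timestamp")

    // Append the stream into a Delta table (illustrative paths)
    val query = events.writeStream
      .format("delta")
      .outputMode("append")
      .option("checkpointLocation", "/mnt/checkpoints/events")
      .start("/mnt/delta/events")

    query.awaitTermination()
  }
}
```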
Confidential, Atlanta, GA
Consultant - Big data/Hadoop, Spark & Scala
Environment: Hadoop, Azure, ADF, Databricks, Spark, Sqoop, Spark SQL, Hive, Cassandra, GitHub, Jenkins, Pig, NiFi, Teradata, PostgreSQL, Kafka, Spark Structured Streaming, DStreams.
Responsibilities:
- Designed the AWS data flow solution to Teradata.
- Implemented the DMF/DIXE framework for the EDS data warehouse.
- Implemented Sqoop jobs to load the data from DB2 and Oracle into Hive tables.
- Implemented Spark Streaming jobs to consume sensor messages from Tibco, parse them, and load them into Teradata and PostgreSQL (see the sketch after this list)
- Processed the data using Spark SQL DataFrames.
- Loaded the processed data into Hive tables.
- Worked on Azure Event Hubs for application instrumentation and for user experience and workflow processing
- Provided day-to-day developer support to Azure customers, resolving escalated, complex issues with the creation of ingestion jobs into Delta tables in Azure Databricks and with exports of data to Teradata stage tables.
- Worked on Azure Blob Storage, creating config files used as sources for the pipelines
- Developed and ran pipelines, created linked services in Azure, and deployed and maintained Azure jobs from Dev to Prod.
- Designed and implemented the database schema, imported data, and built stored procedures on Azure SQL.
- Developed and built jobs for real-time data using Kafka and PySpark to process and load data into AWS S3 buckets
- Built Control-M jobs to schedule them at specific timings, to add interdependencies between jobs, and to send alerts and notifications as per the business use case.
- Performed system monitoring: verified availability of all resources, reviewed system and application logs, and verified completion of scheduled jobs. Created Apache Spark based models and implementations to run business users' low-latency queries faster using in-memory techniques.
- Implemented Spark Streaming with Kafka for TSM2.0, where measurements are received from different detectors. Knowledge of Trifecta and NiFi.
- Integration testing and bug fixes
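A hedged sketch of the streaming-to-relational-store pattern described above. Because the Tibco source and Teradata connection details are project specific, this example substitutes Kafka as the streaming source and PostgreSQL as the JDBC sink; the broker, topic, table, and credentials are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object StreamToPostgresSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("stream-to-postgres-sketch")
      .getOrCreate()

    // Kafka as a stand-in streaming source; broker and topic are hypothetical
    val readings = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092")
      .option("subscribe", "sensor-readings")
      .load()
      .selectExpr("CAST(key AS STRING) AS sensor_id", "CAST(value AS STRING) AS payload")

    // foreachBatch exposes a plain DataFrame per micro-batch, so the standard JDBC writer applies
    val writeBatch: (DataFrame, Long) => Unit = (batch, batchId) => {
      batch.write
        .format("jdbc")
        .option("url", "jdbc:postgresql://db-host:5432/sensors") // hypothetical connection details
        .option("dbtable", "sensor_readings_stage")
        .option("user", "etl_user")
        .option("password", sys.env.getOrElse("DB_PASSWORD", ""))
        .mode("append")
        .save()
    }

    val query = readings.writeStream
      .foreachBatch(writeBatch)
      .option("checkpointLocation", "/tmp/checkpoints/sensor-readings")
      .start()

    query.awaitTermination()
  }
}
```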
Confidential, Fort Worth, TX
Consultant - Big data/Hadoop, Spark & Scala
Environment: Hadoop, Spark, Java, Scala, Spark SQL, HBase, MongoDB, Sqoop, MySQL, Teradata 14, SQL Assistant, Oracle 11g, TPT, Vertica 5.1, HiveQL, Maestro, UNIX, Windows, Toad, SQL Server
Responsibilities:
- Development and ETL Design in Hadoop.
- Implemented Sqoop jobs to load the data from DB2 and Teradata into Hive tables.
- Implemented Spark Streaming jobs to consume sensor messages from Tibco, parse them, and load them into Cassandra (see the sketch after this list)
- Processed the data using Spark SQL DataFrames.
- Loaded the processed data into Hive tables.
- Implemented Spark Streaming with Kafka for TSM2.0, where measurements are received from different detectors. Knowledge of Trifecta and NiFi.
- Integration testing and bug fixes
- Involved in Data loading from MySQL to Cassandra using Sqoop and fixed the discrepancies that occurred during loading
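A minimal sketch of reading a Sqoop-loaded Hive table with Spark SQL DataFrames and writing it to Cassandra. It assumes the DataStax spark-cassandra-connector is on the classpath; the database, keyspace, table, and column names are illustrative.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object HiveToCassandraSketch {
  def main(args: Array[String]): Unit = {
    // Hive support lets Spark SQL read the tables populated by the Sqoop jobs
    val spark = SparkSession.builder()
      .appName("hive-to-cassandra-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical Hive table loaded by Sqoop
    val readings = spark.table("edw.sensor_readings")
      .filter(col("event_ts").isNotNull)
      .withColumn("event_date", to_date(col("event_ts")))

    // Write through the DataStax spark-cassandra-connector; keyspace and table are illustrative
    readings.write
      .format("org.apache.spark.sql.cassandra")
      .option("keyspace", "telemetry")
      .option("table", "sensor_readings")
      .mode("append")
      .save()

    spark.stop()
  }
}
```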
Confidential, Columbus, OH
Consultant- Big Data/Hadoop, Spark & Scala
Environment: Hadoop, Spark, Java, Scala, SparkSQL, HBase, MongoDB, Sqoop, MySQL
Responsibilities:
- Implemented data migration & data cleansing using Scala
- Worked on optimizing the SQL queries using Spark SQL component
- Loaded the data into Spark RDDs/DataFrames/Datasets and performed in-memory computation using the Catalyst optimizer to generate the output response (see the sketch after this list)
- Developed Spark job pipelines used to stream, transform, and aggregate data
- Imported and exported large sets of data into HDFS and vice versa using Sqoop
- Involved in deploying all the Spark application jobs on the Hadoop cluster
- Created HBase tables to store data in variable formats coming from different portfolios
- Wrote MapReduce programs to load data from system-generated log files into the HBase database
- Solved performance issues in Hive with an understanding of joins, grouping, and aggregation, and how they translate into MapReduce jobs
- Delivered results onto the AWS cluster
- Involved in writing build scripts using Ant and Maven
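A minimal sketch of the in-memory DataFrame computation and aggregation described above, with a broadcast join on a small dimension table so the Catalyst optimizer can avoid a shuffle-heavy plan. Input paths, column names, and the output location are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object PortfolioAggregationSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("portfolio-aggregation-sketch")
      .getOrCreate()

    // Hypothetical HDFS inputs: raw portfolio events plus a small dimension table
    val events = spark.read.parquet("/data/portfolio_events")
    val portfolios = spark.read.parquet("/data/portfolios")

    // Cache the reused DataFrame so the computation stays in memory across actions
    events.cache()

    // Broadcast the small dimension to avoid a shuffle-heavy join,
    // then let the Catalyst optimizer plan the aggregation
    val summary = events
      .join(broadcast(portfolios), Seq("portfolio_id"))
      .groupBy(col("portfolio_id"), col("portfolio_name"))
      .agg(count("*").as("event_count"), sum(col("amount")).as("total_amount"))

    summary.write.mode("overwrite").parquet("/data/reports/portfolio_summary")
    spark.stop()
  }
}
```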