Big Data Developer Resume
Charlotte
SUMMARY
- Overall 11 years of experience in development and support of applications using Hadoop, Scala, Python, and RPA Automation.
- Certified CCA Spark and Hadoop Developer (CCA-175).
- Certified AWS Solutions Architect.
- Certified Scrum Master.
- Domain expertise in Supply Chain and Energy Transfer and Storage systems.
- Expertise in data wrangling and data cleaning with model data dictionaries.
- Expertise with tools in the Hadoop ecosystem including Spark, MapReduce, Hive, Airflow, Impala, HDFS, ZooKeeper, Sqoop, Flume, and HBase.
- Substantial experience in Spark 3.0 integration with Kafka 2.4.
- Experience working with the Lambda architecture to design ETL batch and streaming pipelines.
- Maintaining BigQuery, PySpark, and Hive code by fixing bugs and delivering enhancements required by business users.
- Working with AWS cloud using S3 buckets, Athena, AWS Glue ETL, Step Functions, Lambda functions, Redshift, Data Lake, RDS, and EC2 instances with EMR clusters; knowledge of GCP as well.
- Knowledge of the Cloudera platform and Apache Hadoop 0.20.
- Very good exposure to OLAP and OLTP.
- Experienced in managing onshore and offshore teams, including hiring, mentoring, and handling performance appraisals of team members.
- Extensive experience in SDLC and STLC process development and implementation.
- Worked in core Java application development and maintenance support of AMS.
- Project management activities and audits such as CMMI, Lean, and project-level configuration audits (IPWC).
TECHNICAL SKILLS
Hadoop/Big Data: HDFS, Hive, Sqoop, Flume & ZooKeeper (Cloudera platform), Spark 2.0, DataFrames & Spark SQL, Impala, Airflow, Stonebranch, NiFi.
Python/Scala Technologies: Python, Pandas & Java; PCA, dimensionality reduction, t-SNE, CDF, regression & classification, Naive Bayes, KNN.
Cloud: AWS/GCP Storage, S3 & EC2, Route 53, RDS, EMR, AWS CI/CD, Redshift, Kinesis API, Cloud Spanner, BigQuery, Bigtable.
Automation Tools: RPA Automation (UiPath & WinAutomation), Maven, Jenkins, Git & AWS Cloud CI/CD.
Servers: Tomcat 5.0/6.0, WebLogic, WebSphere 7.0/6.1.
Databases: SQL, MySQL, DB2, SQLDbx, Oracle 9i/10g.
OS: DOS, Windows 98/2000/NT, UNIX.
Tools: PuTTY, SSH, FileZilla, WinSCP, ManageNow, IT2B, VSS & RCC, Build Forge, Informatica versioning tool, Tivoli, ICD Tool, ServiceNow, Splunk, Jira.
PROFESSIONAL EXPERIENCE
Confidential
Big Data Developer
Responsibilities:
- Working as a developer in Hive and Impala for parallel data processing on Cloudera systems.
- Working with big data technologies such as Spark 2.3 & 3.0, Scala, Hive, and Hadoop clusters (Cloudera platform).
- Building data pipelines using Data Fabric jobs, Sqoop, Spark, Scala, and Kafka. In parallel, working on the data side with Oracle and MySQL Server on source-to-target data design.
- Worked on parsing nested struct JSON and flattening and exploding complex file structures.
- Designed and implemented Spark SQL tables and Hive script jobs with Stonebranch for scheduling, creating workflows and task flows.
- Used partitioning and bucketing of data in Hive to speed up queries as part of Hive optimization.
- Wide experience in AWS with the Lambda component for design and triggering.
- Used Step Functions to design and create workflow scheduling and implement ETL pipelines.
- Worked closely with AWS Service Catalog and EMR services, running Spark transformations against a Redshift database.
- Wrote Spark programs to move data from storage input locations to output locations, performing data loading, validation, and transformation (a brief sketch of this pattern follows the environment list below).
- Used Scala functions and data structures (arrays, lists, maps) for better code reusability.
- Performed unit testing based on the development work.
- Prepared Technical Release Notes (TRN) for application deployment into the DEV/STAGE/PROD environments.
Environment: HDFS, Hive, Spark, Linux, Kafka, Python, Stonebranch, Cloudera, Oracle 11g/10g, PL/SQL, Unix, JSON and Parquet file formats.
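A minimal Spark/Scala sketch of the load pattern described above (a validated DataFrame written to a partitioned, bucketed Hive table); all table, column, and path names are illustrative and not taken from the actual project:

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object OrdersLoad {
  def main(args: Array[String]): Unit = {
    // Hive support lets Spark SQL manage partitioned and bucketed tables.
    val spark = SparkSession.builder()
      .appName("orders-load")
      .enableHiveSupport()
      .getOrCreate()

    // Read raw JSON from a hypothetical input location.
    val orders = spark.read.json("hdfs:///data/raw/orders")

    // Basic validation and transformation before loading.
    val cleaned = orders
      .filter("order_id IS NOT NULL")
      .withColumnRenamed("orderDate", "order_date")

    // Partition by date and bucket by customer so downstream queries can
    // prune partitions and avoid full shuffles, the Hive optimization
    // pattern mentioned in the bullets above.
    cleaned.write
      .mode(SaveMode.Overwrite)
      .partitionBy("order_date")
      .bucketBy(16, "customer_id")
      .sortBy("customer_id")
      .format("parquet")
      .saveAsTable("orders_cleaned")

    spark.stop()
  }
}
```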
Confidential, Charlotte
Big Data Engineer
Responsibilities:
- Analyzing issues and performing impact analysis for them.
- Ingesting data via Sqoop and Data Fabric jobs from Oracle, DB2, and Salesforce.
- Working on implementing the various stages of data flow in the Hadoop ecosystem: ingestion, processing, and consumption.
- Responsible for wide-ranging data ingestion using Sqoop and HDFS commands. Accumulated partitioned data in various storage formats such as text, JSON, and Parquet. Involved in loading data from the Linux file system to HDFS.
- Storing data files in S3 buckets on a daily basis. Using EC2, EMR, S3, and Redshift to develop and maintain AWS cloud-based solutions.
- Started working with AWS for storage and handling of terabytes of data for customer BI reporting tools.
- Wrote programs using Spark 2.4 to create DataFrames and process data with transformations and actions (see the sketch at the end of this role).
- Working on tickets opened by users regarding various incidents and requests.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Automated all jobs; for scheduling we used CA7 jobs and checked the job logs.
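A minimal Spark 2.4 Scala sketch of the S3 DataFrame processing described above (the bucket name, paths, and columns are hypothetical):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object DailyS3Load {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("daily-s3-load").getOrCreate()

    // Read the day's JSON files from a hypothetical S3 landing prefix.
    val raw = spark.read.json("s3a://example-bucket/landing/sales/2020-01-01/")

    // Transformations: drop bad records and aggregate totals per store.
    val daily = raw
      .filter(col("amount").isNotNull)
      .groupBy(col("store_id"))
      .agg(sum("amount").as("daily_total"))

    // Action: count() forces execution so the load can be validated in logs.
    println(s"stores loaded: ${daily.count()}")

    // Persist curated Parquet output for downstream BI reporting tools.
    daily.write.mode("overwrite")
      .parquet("s3a://example-bucket/curated/sales_daily/2020-01-01/")

    spark.stop()
  }
}
```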
Confidential, Minnesota
Senior System Engineer
Responsibilities:
- Requirements analysis, coding, and application support.
- Analyzing requirements and performing impact analysis for them.
- Installation and configuration of WebSphere Application Server 7.0.
- Integrating with various web servers and databases, and configuring connection pooling.
- Developed standard, reusable mappings and mapplets using various transformations such as Expression, Aggregator, Joiner, Source Qualifier, Router, Lookup, and Update Strategy.
- Worked on the Cloudera platform with Hadoop 2.0 big data.
- Worked closely on Hive and Sqoop ETL data pipelines and data warehouse design.
- Created unit test cases for the developed mappings and verified the data.
- Supporting ETL testing and working on various data warehouse tickets.
- Working in the QA and DEV environments during the ETL development phase.
Environment: Core Java, L2 & L3 support activities, PuTTY, RCC, WebSphere 7.0, SSH, Build Forge, Informatica 9.6, Oracle 10g, Unix, SQL Developer, flat files, Hadoop, Hive, Sqoop.
Confidential
Senior System Engineer
Responsibilities:
- Supporting the middleware team's applications.
- Analyzing issues and performing impact analysis for them.
- Providing L2 & L3 application support for the CC&B system and external systems.
- Monitoring the CC&B system and the IBM APPS team components.
- Monitoring CDC jobs for IVR as part of the RDW team.
- Extensively worked on data extraction, transformation, and loading with RDBMS and flat files.
- Ingesting data via Sqoop and Flume from an Oracle database.
- Working on implementing the various stages of data flow in the Hadoop ecosystem: ingestion, processing, and consumption.
- Working with Hive for structured data: creating tables and loading data into them using Hive.
- Involved in loading data from the Linux file system to HDFS.
- Importing and exporting data into HDFS and Hive using Sqoop.
- Implemented partitioning, dynamic partitions, and buckets in Hive (a brief sketch follows).
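A minimal Scala/Spark SQL sketch of the Hive partitioning pattern above, using a dynamic partition insert; the database, table, and column names are illustrative only, and bucketing would add a CLUSTERED BY clause to the table DDL:

```scala
import org.apache.spark.sql.SparkSession

object HivePartitionDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-partition-demo")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // Partitioned Hive table; names and schema are illustrative only.
    spark.sql("CREATE DATABASE IF NOT EXISTS billing")
    spark.sql("""
      CREATE TABLE IF NOT EXISTS billing.usage_events (
        account_id STRING,
        usage_kwh  DOUBLE
      )
      PARTITIONED BY (event_date STRING)
      STORED AS ORC
    """)

    // Dynamic partitioning derives the event_date partition from the data.
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

    // Stand-in for data landed by Sqoop/Flume ingestion.
    Seq(("A100", 12.5, "2016-07-01"), ("A200", 7.25, "2016-07-02"))
      .toDF("account_id", "usage_kwh", "event_date")
      .createOrReplaceTempView("usage_staging")

    spark.sql("""
      INSERT OVERWRITE TABLE billing.usage_events PARTITION (event_date)
      SELECT account_id, usage_kwh, event_date FROM usage_staging
    """)

    spark.stop()
  }
}
```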
Confidential
System Engineer
Responsibilities:
- Gas Transfer System (GTS) is a web-based standalone Java application.
- GTS primarily provides TRUenergy with the capability to manage gas customer transfers among distributors through market systems. In addition, GTS provides functionality to perform MIRN Discovery and Standing Data requests with participating Distribution Businesses.
- GTS sends transaction acknowledgments to market systems for transfer transactions received. GTS is built on a three-tier client-server architecture: Internet Explorer is used for presentation on client workstations, a Java application server provides all processing capabilities, and an Oracle database on a separate server machine fulfils data requirements. This three-tier approach minimizes deployment complexity, distributes workload, and utilizes existing infrastructure.