
Data Operations Associate Resume


San Francisco, CA

SUMMARY

More than two years of FinTech and tech industry experience in Business Intelligence, ETL development, data warehousing, and data/statistical analysis.

TECHNICAL SKILLS

Programming/Scripting Languages: SQL, PySpark, Python, SAS 9, Core Java (coursework)

Warehouse/Distributed: PostgreSQL, SQL Server, MySQL, Hadoop, Hive

Others: Linux/Unix, shell scripting, SSIS, CloverETL, Erwin 9, Sqoop, Tableau

PROFESSIONAL EXPERIENCE

Data Operations Associate

Confidential, San Francisco, CA

Responsibilities:

  • Created stored procedures, user-defined functions, and cursors in PostgreSQL on AWS to populate the financial statement, financial statement account, and financial ratio account tables on a daily basis, calculating financial ratios only for new statements.
  • The financial ratio function reads each ratio formula from a formula table and calculates the ratios from financial items by generating dynamic SQL (see the sketch after this list).
  • Created an ETL pipeline to source these tables and deliver the calculated ratio data from AWS to the data mart (SQL Server) and the CreditEdge server.
  • Handled ETL (EDF8) production jobs every day, along with fixing daily issues.
  • Created a shell script to automate extraction of data from source text files, followed by manipulation and bulk copy into the corresponding server tables.
  • Tested and validated results from the old EDF8 engine against the new EDF8 engine before deploying it to production, and fixed bugs in existing jobs.
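
A minimal sketch of the dynamic-SQL ratio calculation described above, using psycopg2 as the PostgreSQL driver; the formula table layout and the table/column names (ratio_formula, financial_ratio_account, processed_flag) are illustrative assumptions, not the actual schema.

```python
# Sketch only: table and column names are hypothetical; assumes psycopg2 is installed.
import psycopg2

conn = psycopg2.connect("dbname=finance")  # connection details are placeholders

with conn, conn.cursor() as cur:
    # Each formula is an expression over financial-item columns, e.g. "total_debt / total_assets".
    cur.execute("SELECT ratio_name, formula FROM ratio_formula")
    for ratio_name, formula in cur.fetchall():
        # Build dynamic SQL that evaluates the formula for new statements only.
        dynamic_sql = (
            "INSERT INTO financial_ratio_account (statement_id, ratio_name, ratio_value) "
            f"SELECT s.statement_id, %s, {formula} "
            "FROM financial_statement_account s "
            "WHERE s.processed_flag = FALSE"
        )
        cur.execute(dynamic_sql, (ratio_name,))
```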

ETL Developer

Confidential, San Francisco, CA

Responsibilities:

  • Created an ETL workflow to deliver batches of insights generated by the Personatics predictive engine from near-real-time activity to the warehouse for reporting.
  • Created a simple data model in the data warehouse's data source layer, using Erwin 9, to accommodate this data on a daily basis scheduled via cron (a load sketch follows this list).
  • Created a new data extract from an existing data source, merged it with the existing data source, and sent it to a third party (Adobe) for keyword optimization.
  • Tweaked the existing data model to accommodate these changes and loaded the data through the daily ETL job.
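
A rough sketch of the kind of cron-driven daily batch load referenced above; the file drop location, staging table, and connection string are hypothetical, and psycopg2's COPY support stands in for whatever loader the actual job used.

```python
# Illustrative daily batch load; path, table, and DSN are placeholders.
import datetime
import psycopg2

batch_date = datetime.date.today().isoformat()
batch_file = f"/data/insights/insights_{batch_date}.csv"  # hypothetical drop location

conn = psycopg2.connect("dbname=warehouse")
with conn, conn.cursor() as cur, open(batch_file) as f:
    # Land the raw batch in a source-layer staging table; downstream jobs model it further.
    cur.copy_expert(
        "COPY stg_insights (customer_id, insight_type, insight_ts, payload) "
        "FROM STDIN WITH (FORMAT csv, HEADER true)",
        f,
    )
```

Cron would trigger a script like this once a day, e.g. `0 2 * * * python load_insights.py`.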

Data Analyst

Confidential, Spartanburg, SC

Responsibilities:

  • Created and validated stored procedures to perform ETL (delta uploads) of data from an Oracle data source to in-house data marts (SQL Server and Postgres).
  • Coordinated with the credit risk and reporting teams to gather their data requirements and developed stored procedures to meet them.
  • Developed stored procedures to extract, clean, and manipulate data, followed by data validation, for use in the model-building phase.
  • Analyzed default and delinquency rates across score and income bands to select an appropriate multiplier for title loans.
  • Bucketed loans by loan amount, score, and income, and generated summary statistics to detect anomalies in the loan-to-income ratio (fraud detection); a bucketing sketch follows this list.
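
An illustrative pandas sketch of the bucketing and summary-statistics step; the column names, band edges, and 99th-percentile flag are assumptions made for the example.

```python
# Illustrative only: column names, band edges, and thresholds are assumptions.
import pandas as pd

loans = pd.read_csv("title_loans.csv")  # hypothetical extract

# Band score and income, then summarize loan-to-income (LTI) per bucket.
loans["lti"] = loans["loan_amount"] / loans["annual_income"]
loans["score_band"] = pd.cut(loans["credit_score"], bins=[300, 580, 670, 740, 850])
loans["income_band"] = pd.qcut(loans["annual_income"], q=5)

summary = (
    loans.groupby(["score_band", "income_band"], observed=True)["lti"]
    .agg(["count", "mean", "std", "max"])
)

# Unusually high LTI relative to the rest of the book gets flagged for fraud review.
suspicious = loans[loans["lti"] > loans["lti"].quantile(0.99)]
```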

ETL Developer intern

Confidential, Dallas, TX

Responsibilities:

  • Investigated legacy system code written in C++ to extract the business logic, and created ETL pipelines (GMC PrintNet ETL and the in-house LEXER ETL tool) to transform incoming data feeds into XML format and validate the generated XMLs.
  • Created XML schemas for the generated output XML files, to be used further by the document composition process (a validation sketch follows this list).
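
A small sketch of validating a generated XML file against its XSD using lxml; the file names are placeholders.

```python
# Validate a generated XML document against its XSD; file names are placeholders.
from lxml import etree

schema = etree.XMLSchema(etree.parse("statement_feed.xsd"))
doc = etree.parse("statement_feed.xml")

if schema.validate(doc):
    print("XML is valid against the schema")
else:
    # error_log carries the line number and message for each violation.
    for error in schema.error_log:
        print(error.line, error.message)
```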

BI-ETL Developer intern

Confidential, Dallas, TX

Responsibilities:

  • Coordinated with the reporting team to gather requirements and performed a gap analysis between the existing warehouse and the Oracle model in the ERP system.
  • Created and tested an ETL pipeline to load data from the ERP system into the local DW layer by inserting and updating data in staging and history tables (an upsert sketch follows this list).
  • Created an OBAW view along with tables in the DW, plus a mapping document, to facilitate future processes.
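
A sketch of the staging-to-history insert/update pattern mentioned above, shown with sqlite3 purely so the example is self-contained; the real targets were the ERP-fed warehouse layers, and all table and column names here are invented.

```python
# Staging-to-history upsert pattern; sqlite3 keeps the sketch self-contained,
# and all table/column names are invented for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_orders  (order_id INTEGER PRIMARY KEY, status TEXT, amount REAL);
    CREATE TABLE hist_orders (order_id INTEGER PRIMARY KEY, status TEXT, amount REAL);
    INSERT INTO stg_orders VALUES (1, 'OPEN', 100.0), (2, 'CLOSED', 250.0);
""")

# Update history rows that already exist, then insert the ones that do not.
conn.executescript("""
    UPDATE hist_orders
    SET status = (SELECT s.status FROM stg_orders s WHERE s.order_id = hist_orders.order_id),
        amount = (SELECT s.amount FROM stg_orders s WHERE s.order_id = hist_orders.order_id)
    WHERE order_id IN (SELECT order_id FROM stg_orders);

    INSERT INTO hist_orders
    SELECT * FROM stg_orders
    WHERE order_id NOT IN (SELECT order_id FROM hist_orders);
""")
conn.commit()
```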

Data Analyst intern

Confidential, Dallas, TX

Responsibilities:

  • Analyzed gamers' data (70 million records, 102 features) to help emulate bot behavior based on gamer skill, and detected outliers.
  • Cleaned data, handled missing values, and applied visualization and data transformation techniques to surface insights and trends.
  • Read and wrote data in formats such as Avro, JSON, text, SequenceFile, Parquet, and ORC, with compressions such as Snappy and Gzip, using PySpark (RDDs, DataFrames).
  • Performed analysis to generate meaningful insights and then saved the results to HDFS and a MySQL database using PySpark (Spark SQL, DataFrames); see the sketch after this list.
  • Created list and hash partitions (dynamic and static) in the Hive database to improve performance, and loaded data from MySQL and HDFS.
  • Integrated HDFS with MySQL and Hive (Hive metastore) for import and export by creating Sqoop jobs to perform incremental uploads.
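
A minimal PySpark sketch covering the read/write, Spark SQL, Hive, and JDBC steps above; the paths, table names, and MySQL connection settings are illustrative placeholders.

```python
# Minimal PySpark sketch; paths, table names, and JDBC settings are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("gamer-insights").enableHiveSupport().getOrCreate()

# Read a raw JSON feed and rewrite it as Snappy-compressed Parquet on HDFS.
events = spark.read.json("hdfs:///data/raw/gamer_events/")
events.write.mode("overwrite").option("compression", "snappy").parquet(
    "hdfs:///data/curated/gamer_events_parquet/"
)

# Aggregate with Spark SQL and persist the result to a partitioned Hive table.
events.createOrReplaceTempView("events")
daily = spark.sql(
    "SELECT event_date, player_id, COUNT(*) AS actions "
    "FROM events GROUP BY event_date, player_id"
)
daily.write.mode("overwrite").partitionBy("event_date").saveAsTable("analytics.daily_actions")

# Push the same summary to MySQL over JDBC (credentials are placeholders).
daily.write.format("jdbc").options(
    url="jdbc:mysql://dbhost:3306/analytics",
    dbtable="daily_actions",
    user="etl_user",
    password="***",
).mode("overwrite").save()
```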
