
Python Backend/ETL Developer Resume


SUMMARY:

  • Seeking a position as a Big Data Analyst and Data Engineer building substantial solutions for complex business problems, including large-scale data warehousing, real-time analytics and streaming visualizations using OpenStack technologies.
  • 5 years of IT experience in all phases of the SDLC, along with experience in application design and software development.
  • Capable of processing large sets of structured, semi-structured and unstructured data and supporting systems application architecture.
  • Experience with Python OpenStack APIs.
  • Worked on Datasets related to retail, telecommunication and financial industries.
  • Familiar with Object-Oriented Programming concepts.
  • Able to assess business rules, collaborate with participants and perform source-to-target data mapping, design and review.
  • Experience in writing subqueries, stored procedures, triggers, cursors, and functions on SQL Server, Cassandra, HBase (Phoenix SQL), Hive and PostgreSQL databases.
  • Familiar with AWS cloud services like EC2, Elastic Container Service (ECS), Simple Storage Service (S3) and Elastic MapReduce (EMR).
  • Experience analyzing large datasets with in-memory data structures using Pandas and Spark.
  • Wrote scripts to read from and write to Hive and HBase through the Thrift service.
  • Worked as a developer in an agile environment with Git for version control.
  • Familiar with Test-Driven Development and unit & integration testing.
  • Hands-on experience with parallel, concurrent and reusable programming techniques.
  • Familiar with data ingestion pipeline design, Hadoop architectures and data modeling.
  • Developed web services using Spark and the Flask and Django frameworks.
  • Developed and optimized ETL workflows in both legacy and distributed environments.
  • Capable of writing efficient analytical queries that help analysts spot trends.
  • Experience working with IDEs like Zeppelin, Notebook and PyCharm.
  • Experience using JSON, XML, Pickle, ORC, Avro and Parquet file formats.
  • Configured Flume to extract data from web servers and load it into HDFS.
  • Developed UDFs (Python) for Pig and Hive to preprocess and filter data sets for analysis in distributed environments.
  • Imported and exported structured, semi-structured and unstructured data between HDFS and SQL databases using batch and streaming applications.
  • Developed data streaming applications in Hadoop/Big Data environments using Kafka.
  • Wrote Spark applications using PySpark for real-time data analysis, connecting to multiple data warehouses such as Hive and HBase (see the sketch after this list).
  • Worked with Docker services and created application-specific Docker images.
  • Experience creating user interfaces using HTML, CSS and JavaScript.
  • Expertise in acquiring web data through APIs and web scraping techniques.
  • Capable of writing configuration and deployment scripts using Fabric and Jenkins.
  • Developed dashboards using Tableau Desktop, Bokeh and D3.js.
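
As a minimal illustration of the PySpark-on-Hive work mentioned above, the sketch below aggregates a Hive table and writes the result back for reporting; the table and column names (sales, region, amount) are hypothetical placeholders.

  # Minimal PySpark sketch: aggregate a Hive table and persist the result.
  # The table and column names are hypothetical.
  from pyspark.sql import SparkSession
  from pyspark.sql import functions as F

  spark = (
      SparkSession.builder
      .appName("hive-aggregation-sketch")
      .enableHiveSupport()              # connect to the Hive metastore
      .getOrCreate()
  )

  # Read a Hive table, aggregate it, and save the report as a new table.
  sales = spark.table("default.sales")
  report = sales.groupBy("region").agg(F.sum("amount").alias("total_amount"))
  report.write.mode("overwrite").saveAsTable("default.sales_report")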

TECHNICAL SKILLS:

Languages: Python, SQL, C++, Go, HTML, CSS, JavaScript, Jinja2

Technologies: JDBC, NOSQL, Docker, AWS, Git

Frameworks: Tkinter, Flask, Django

IDE: PyCharm, IDLE, Notebook, Zeppelin

Build Tools: PyBuilder, Pip, Npm, VirtualEnv, Coverage, Jenkins, Docker

Tools: Tableau, Cron, Matplotlib, Pandas, Flume, Splunk, Bubbles (ETL), PySpark, Bokeh, Kafka, Boto3 (AWS)

Operating Systems: Windows, Linux, OSX

Big Data Technologies: Hortonworks Hadoop, HDFS, Spark, Oozie, Sqoop, HBase, Hive, Impala, Pig, Flume and Hue, Cassandra, MongoDB

PROFESSIONAL EXPERIENCE:

Confidential

Python Backend/ETL Developer

Responsibilities:

  • Involved in the architecture, flow and database model of the application.
  • Developed ETL jobs per requirements to load data into the staging database (Postgres) from various data sources and REST APIs.
  • Developed analytical queries in Teradata, SQL-Server, and Oracle.
  • Developed a web service on top of the Postgres database using the Python Flask framework, which served as the backend for a real-time dashboard (see the sketch after this list).
  • Partially involved in developing the front-end components in Angular and in editing the HTML, CSS and JavaScript.
  • Wrote Unit and Integration Tests for all the ETL services.
  • Containerized and deployed the ETL and REST services on AWS ECS through the Jenkins CI/CD pipeline.
  • Worked on optimization and memory management of the ETL services.
  • Developed Splunk queries and dashboards for debugging the logs generated by the ETL and REST services.
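
A minimal sketch of the kind of Flask read endpoint over Postgres described above; the connection settings and the metrics table/columns are hypothetical, and in practice they would come from configuration rather than literals.

  # Minimal Flask sketch: expose rows from a Postgres table to a dashboard.
  # The DSN and the "metrics" table/columns are hypothetical.
  import psycopg2
  from flask import Flask, jsonify

  app = Flask(__name__)

  def get_connection():
      return psycopg2.connect(host="localhost", dbname="staging",
                              user="etl", password="secret")

  @app.route("/metrics/latest")
  def latest_metrics():
      with get_connection() as conn, conn.cursor() as cur:
          cur.execute("SELECT name, value, updated_at FROM metrics "
                      "ORDER BY updated_at DESC LIMIT 100")
          rows = cur.fetchall()
      return jsonify([
          {"name": name, "value": value, "updated_at": ts.isoformat()}
          for name, value, ts in rows
      ])

  if __name__ == "__main__":
      app.run(port=5000)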

Environment: Python, Postgres, Docker, Teradata, Flask, Gunicorn, AWS, ECS, Jenkins, SQL Server, S3, Kafka, Angular 4, D3.js, CSS, HTML5, JavaScript.

Confidential

Python/ETL Tester & Developer

Responsibilities:

  • Created integrated test environments for the ETL applications developed in Go, using Docker containers and Python APIs (see the sketch after this list).
  • Added the integrated testing environments to the Jenkins pipeline so that testing runs automatically before the continuous deployment process.
  • Installed data sources like SQL Server, Cassandra and remote servers in Docker containers to provide an integrated testing environment for the ETL applications.
  • Also wrote unit tests for the developed scripts so they pass quality checks before being pushed to deployment.
  • Worked on optimization and memory management of the ETL applications developed in Go and Python, and reused existing code blocks for better performance.
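
As a rough illustration of the Docker-backed integration test environments described above, the sketch below uses the Docker SDK for Python and a pytest fixture to start a throwaway Cassandra container; the image tag, port mapping and wait time are assumptions.

  # Minimal pytest fixture sketch: start a disposable Cassandra container
  # for integration tests via the Docker SDK for Python.
  import time
  import docker
  import pytest

  @pytest.fixture(scope="session")
  def cassandra_container():
      client = docker.from_env()
      container = client.containers.run(
          "cassandra:3.11",              # hypothetical image tag
          detach=True,
          ports={"9042/tcp": 9042},      # expose the CQL port to the host
      )
      time.sleep(30)                     # crude wait for the node to come up
      yield container                    # tests run against localhost:9042
      container.remove(force=True)       # tear down after the test session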

Environment: Go, Python, Cassandra, Docker, SQL Server, AWS, EC2, Mesos, Jenkins, S3, Kafka, Splunk.

Confidential

Python/Hadoop Developer

Responsibilities:

  • Evaluated business requirements and prepared detailed specifications that follow project guidelines required to develop written programs.
  • Wrote scripts in Python for extracting data from JSON and XML files.
  • Developed the back-end web services for the worker using Python Flask REST APIs.
  • Designed and developed CRUD scripts to load transactional data into Hive and HBase using Thrift and Python scripting (see the sketch after this list).
  • Performed MapReduce operations on raw files located in HDFS for staging and transforming the data using Pig and Spark.
  • Collected social media data from various REST services and scraped raw web pages using web scraping frameworks like Scrapy.
  • Wrote Python scripts that extract sentiments and insights from the collected text data using the Watson Analytics API.
  • Developed Spark jobs that aggregate large datasets from HBase and store the aggregated results in temporary tables for reporting.
  • Implemented the Oozie workflow engine on a Hortonworks Hadoop cluster to run multiple ETL jobs developed in Python, Pig and Spark in an orderly manner.
  • Worked with front-end libraries such as JavaScript and the Bokeh API for responsive web pages.
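
A small sketch of Thrift-based HBase CRUD scripting of the kind described above, using the happybase library; the Thrift host, table name and column family are hypothetical.

  # Minimal happybase sketch: CRUD against HBase over the Thrift service.
  # Host, table and column family names are hypothetical.
  import happybase

  connection = happybase.Connection("hbase-thrift-host", port=9090)
  table = connection.table("transactions")

  # Create / update a row keyed by transaction id.
  table.put(b"txn-0001", {b"cf:amount": b"42.50", b"cf:status": b"SETTLED"})

  # Read it back.
  row = table.row(b"txn-0001")
  print(row[b"cf:amount"])

  # Delete it.
  table.delete(b"txn-0001")
  connection.close()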

Environment: Python, AWS, Hortonworks, HDFS, Hive, Kafka, HBase, Docker, Spark, Tableau, Bokeh, Phoenix SQL, Scrapy, XML, HTML, Pandas, Watson-Alchemy.

Confidential

Python/Hadoop Developer

Responsibilities:

  • Evaluated business requirements and prepared detailed specifications that follow project guidelines required to develop written programs.
  • Developed an ETL service that watches for files on the server and publishes them to a Kafka queue.
  • Developed a data consumer which takes data from the Kafka queue and loads it into Hive tables (see the sketch after this list).
  • Worked closely with data scientists to migrate prediction algorithms/models from RStudio to the Python scikit-learn API, and was involved in feature selection for building the prediction models.
  • Involved in designing Hive tables using optimization techniques like bucketing/partitioning to distribute the data across the cluster.
  • Created views in Hive to provide the datasets required for building the prediction models.
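
As a rough sketch of the Kafka-to-Hive consumer described above, the example below batches messages from a topic and inserts them into a Hive staging table using kafka-python and PyHive; the broker address, topic, message schema and table name are all hypothetical.

  # Minimal sketch: consume file events from Kafka and load them into Hive.
  # Broker, topic, message fields and table name are hypothetical.
  import json
  from kafka import KafkaConsumer    # kafka-python
  from pyhive import hive            # PyHive

  consumer = KafkaConsumer(
      "file-events",
      bootstrap_servers="broker:9092",
      value_deserializer=lambda v: json.loads(v.decode("utf-8")),
      auto_offset_reset="earliest",
  )

  conn = hive.connect(host="hive-server", port=10000)
  cursor = conn.cursor()

  batch = []
  for message in consumer:
      batch.append(message.value)    # e.g. {"file_name": "...", "size": 123}
      if len(batch) >= 100:          # flush every 100 records
          values = ", ".join(
              "('{file_name}', {size})".format(**record) for record in batch
          )
          cursor.execute("INSERT INTO staging.file_events VALUES " + values)
          batch = []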

Environment: Python, Hadoop, scikit-learn, HDFS, Hive, Hortonworks, Oozie, MapReduce, Spark, Kafka, Tableau.

Confidential

Python Developer

Responsibilities:

  • Writing Python scripts to parse XML documents and JSON-based REST web service responses and load the data into the database (see the sketch after this list).
  • Writing ORM models for generating complex SQL queries and building reusable code and libraries in Python for future use.
  • Working closely with software developers to debug software and system problems.
  • Profiling Python code for optimization and memory management and implementing multithreading functionality.
  • Involved in creating stored procedures that retrieve data and help analysts spot trends.
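
A minimal sketch of the XML/JSON parsing and database-load scripts described above, using the standard library plus cx_Oracle; the file layout, REST URL, table name and connection string are hypothetical.

  # Minimal sketch: parse XML and JSON sources and load rows into Oracle.
  # File structure, endpoint URL, table and credentials are hypothetical.
  import json
  import urllib.request
  import xml.etree.ElementTree as ET
  import cx_Oracle

  def parse_xml(path):
      """Yield (id, name) pairs from a <records><record .../></records> file."""
      root = ET.parse(path).getroot()
      for rec in root.findall("record"):
          yield rec.get("id"), rec.findtext("name")

  def fetch_json(url):
      """Yield (id, name) pairs from a JSON REST endpoint."""
      with urllib.request.urlopen(url) as resp:
          for item in json.load(resp):
              yield item["id"], item["name"]

  def load(rows):
      conn = cx_Oracle.connect("etl_user/secret@localhost/XEPDB1")
      cursor = conn.cursor()
      cursor.executemany(
          "INSERT INTO staging_records (id, name) VALUES (:1, :2)", list(rows))
      conn.commit()
      conn.close()

  if __name__ == "__main__":
      load(parse_xml("records.xml"))
      load(fetch_json("https://example.com/api/records"))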

Environment: Python, Oracle, JSON, XML
