Job ID: 33511
Company: Internal Postings
Location: Columbus, IN
Type: Contract
Duration: 6 Months
Salary: DOE
Status: Active
Openings: 1
Posted: 16 Sep 2021
Job seekers, please send resumes to resumes@hireitpeople.com
Must Have Skills:
  • Must have Azure cloud experience and understand Azure components such as ADF (Azure Data Factory), Azure SQL, and Azure Databricks.
  • Must have very strong Databricks, Spark, PySpark, and Databricks SQL skills on Azure.
  • Must have strong ETL and ELT experience.
  • Must have strong Python and Databricks SQL skills beyond just calling the Spark API; must be fluent in the Python programming language.
  • Must have relational database knowledge for optimal loading of data from on-premises systems and the data lake.
  • Must have experience with data lakes and databases.
  • Must have knowledge of OOP and functional programming to create a reusable ETL framework.
  • Must understand the encryption and security required for PII, financial, and other sensitive data.
  • Must understand Delta Lake and other big data file formats.
  • Good to have exposure to DevOps and CI/CD in the big data space.
  • Good to have Airflow or AppWorx experience.
  • Good to have exposure to the manufacturing domain.
  • Good to have experience with SQL as well as NoSQL databases.
  • Good to have Active Directory (AD) security experience.
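As a rough illustration of the "OOP and functional programming to create a reusable ETL framework" skill above, here is a minimal sketch. All class and function names are hypothetical, and plain Python dicts and lists stand in for what would be Spark DataFrames in a real Databricks implementation:

```python
from abc import ABC, abstractmethod
from typing import Callable, Iterable

class EtlStep(ABC):
    """Base class for one step in a template-driven pipeline."""
    @abstractmethod
    def run(self, rows: Iterable[dict]) -> Iterable[dict]:
        ...

class Transform(EtlStep):
    """Wraps a pure row-level function, so transforms stay reusable."""
    def __init__(self, fn: Callable[[dict], dict]):
        self.fn = fn
    def run(self, rows):
        return [self.fn(r) for r in rows]

class Filter(EtlStep):
    """Keeps only rows matching a predicate."""
    def __init__(self, predicate: Callable[[dict], bool]):
        self.predicate = predicate
    def run(self, rows):
        return [r for r in rows if self.predicate(r)]

class Pipeline:
    """Chains steps in order; a 'template' is just a list of steps."""
    def __init__(self, steps: list[EtlStep]):
        self.steps = steps
    def run(self, rows: Iterable[dict]) -> list[dict]:
        data = list(rows)
        for step in self.steps:
            data = list(step.run(data))
        return data

# Example template: drop inactive rows, then normalize names.
pipeline = Pipeline([
    Filter(lambda r: r["active"]),
    Transform(lambda r: {**r, "name": r["name"].strip().title()}),
])

rows = [{"name": "  ada lovelace ", "active": True},
        {"name": "noise", "active": False}]
print(pipeline.run(rows))  # → [{'name': 'Ada Lovelace', 'active': True}]
```

The point of the pattern is that new pipelines are assembled from existing, individually tested steps rather than written from scratch, which is what makes the framework template-driven.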
Detailed Job Description:
  • Work closely with the architect and business units to understand technical requirements, and independently implement reusable code.
  • Develop an ETL framework for template-driven ETL.
  • Develop Databricks code that can call Scala and other required libraries.
  • Work with offshore and onshore teams; mentor team members on ETL and conduct knowledge transfer (KT) on the framework and design.
  • Implement transformations and aggregations per requirements.
  • Work in an Agile manner, resolve ambiguous requirements, and communicate effectively with peers.
  • Minimum experience: 10+ years of IT experience, with at least 3 years of Python and Databricks experience.
  • We need a Data Engineer/Lead to develop a reusable ETL framework on Azure Databricks using Spark 3.0.
  • This custom framework will be used to create a template-driven data load and transformation system on the data lake and big databases. The framework shall use components from Azure and Databricks (Spark).
  • The developer is expected to write Python test cases and data quality checks for all code produced, as a precondition for CI/CD and promotion to higher environments.
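A minimal sketch of the kind of data quality checks the last point describes. The function names are illustrative assumptions; in practice these checks would run against Spark DataFrames and be wired into the CI/CD gate:

```python
def check_not_null(rows: list[dict], column: str) -> bool:
    """Pass only if no row has a missing value in `column`."""
    return all(r.get(column) is not None for r in rows)

def check_unique(rows: list[dict], column: str) -> bool:
    """Pass only if `column` has no duplicates (e.g. a primary key)."""
    values = [r[column] for r in rows]
    return len(values) == len(set(values))

def check_in_range(rows: list[dict], column: str, lo, hi) -> bool:
    """Pass only if every value in `column` falls within [lo, hi]."""
    return all(lo <= r[column] <= hi for r in rows)

# Minimal "test case" usage, as would gate a CI/CD stage:
rows = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 25.5}]
assert check_not_null(rows, "id")
assert check_unique(rows, "id")
assert check_in_range(rows, "amount", 0, 100)
print("all data quality checks passed")
```

Each check returns a boolean rather than raising, so the same functions can back both unit tests and runtime validation reports.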

Minimum years of experience*: 10+ years

Certifications Needed: No

Interview Process (Is face to face required?): No

Does this position require Visa independent candidates only? No