Job ID: 19405
Company: Internal Postings
Location: Durham, NC
Type: Contract
Duration: 7+ months
Salary: Open
Status: Active
Openings: 1
Posted: 18 Dec 2018
Job seekers, please send resumes to resumes@hireitpeople.com

Big Data Engineer - ETL/Spark
Location: Durham, NC (100% Onsite)
Duration: 6 months, contract-to-hire (permanent)


Required Skills:
-Database building and design experience
-Basic Linux/shell scripting experience
-Experience with Intermediate/Advanced ETL techniques
-Experience with Spark 

Job Profile Summary:
The Data Engineer will work closely with senior engineers, data scientists and other stakeholders to design and maintain moderate to advanced data models. The Data Engineer is responsible for developing and supporting advanced reports that provide accurate and timely data for internal and external clients. The Data Engineer will design and grow a data infrastructure that powers our ability to make timely and data-driven decisions.

Job Description
· Extract data from multiple sources, integrate disparate data into a common data model, and integrate data into a target database, application, or file using efficient programming processes
· Document and test moderately complex data systems that bring together data from disparate sources, making it available to data scientists and other users, using scripting and/or programming languages
· Write and refine code to ensure performance and reliability of data extraction and processing
· Participate in requirements-gathering sessions with business and technical staff to distill technical requirements from business requests
· Develop SQL queries to extract data for analysis and model construction
· Own delivery of moderately sized data engineering projects
· Define and implement integrated data models, allowing integration of data from multiple sources
· Design and develop scalable, efficient data pipeline processes to handle data ingestion, cleansing, transformation, integration, and validation required to provide access to prepared data sets to analysts and data scientists
· Ensure performance and reliability of data processes
· Define and implement data stores based on system requirements and consumer requirements
· Document and test data processes, including performing thorough data validation and verification
· Collaborate with cross-functional teams to resolve data quality and operational issues and ensure timely delivery of products
· Develop and implement scripts for database and data process maintenance, monitoring, and performance tuning
· Analyze and evaluate databases to identify and recommend improvements and optimizations
· Design eye-catching visualizations to convey information to users

Hiring Requirements
· Bachelor's degree in Computer Science or related field or equivalent experience
· 3 years of SQL programming experience (intermediate to advanced skills)
· 3 years of programming experience in Python, R, or another programming language
· Demonstrated experience working with large and complex data sets
· Experience with business intelligence tools (Tableau)

Hiring Preferences
· Experience with Hadoop, Hive and/or other Big Data technologies
· Experience with ETL or Data Pipeline tools
· Experience with query and process optimization
· Experience working in AWS and/or using Linux-based systems
· Ability to translate task/business requirements into written technical requirements
· Reliable task estimation skills
· Excellent quantitative, problem-solving, and analytical skills
· Ability to document data pipeline architecture and design
· Ability to collaborate effectively with business stakeholders, performance consultants, data scientists, and other data engineers
· Proficiency in MS Office applications, including expert-level Excel programming
· Ability to quickly become an expert in the operational processes and data of each line of business
· Ability to troubleshoot and document findings and recommendations
· Ability to communicate risks, problems, and updates to leadership
· Ability to keep up with a rapidly evolving technology space

Required Skills:
-Database building and design experience
-Basic Linux/shell scripting experience
-Experience with ER (entity relationship) tools
-Experience with Intermediate/Advanced ETL techniques

Good-to-have but not required skills:
-Experience with AWS, Hadoop, Alteryx, Tableau
-Experience with Master Dataset Management
-Experience with data warehousing