Job seekers, please send resumes to resumes@hireitpeople.com
Detailed Job Description:
- 3 years of experience with each of Python, PySpark, and SQL (strong SQL skills required).
- Ability to work in a UNIX environment.
- 5+ years of experience processing large volumes and varieties of data (structured and unstructured data such as XML, JSON, and PDF files; writing code for parallel processing).
- 3+ years of experience using the Hadoop platform for analysis, including familiarity with Hadoop cluster environments and resource-management configuration for analysis work.
- Detail oriented, with excellent verbal and written communication skills; this person may have a few hours of meetings per day.
- Must be able to manage multiple priorities and meet deadlines.
- Degree in Computer Science, Statistics, Economics, Business, Mathematics or related field.
Job Responsibilities:
- Cleanse, manipulate, and analyze large datasets (structured and unstructured data: XML, JSON, PDF) using the Hadoop platform.
- Develop Python, PySpark, and Spark scripts to filter, cleanse, map, and aggregate data (a minimal PySpark sketch follows this list).
- Build dashboards in R/Shiny for end-user consumption.
- Manage and implement data processes (data quality reports).
- Develop data profiling, deduplication, and record-matching logic for analysis (see the profiling and deduplication sketches after this list).
- Program in Python, PySpark, and Spark for data ingestion.
- Program on a big data platform built on Hadoop.
- Present ideas and recommendations to management on the best use of Hadoop and other technologies.
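As a rough illustration of the filter/cleanse/aggregate scripting described above, here is a minimal PySpark sketch. The input path, column names, and output location are hypothetical, chosen only to show the pattern of ingesting semi-structured JSON, cleansing it, and aggregating it.

```python
# Minimal PySpark sketch of a filter/cleanse/aggregate job.
# The path and column names (event_type, event_ts, region) are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("cleanse-aggregate-sketch").getOrCreate()

# Ingest semi-structured JSON records (schema inferred for brevity).
raw = spark.read.json("hdfs:///data/events/*.json")  # hypothetical path

cleansed = (
    raw
    .filter(F.col("event_type").isNotNull())              # drop malformed rows
    .withColumn("event_ts", F.to_timestamp("event_ts"))   # normalize types
    .withColumn("region", F.upper(F.trim(F.col("region"))))  # cleanse text
)

# Aggregate: daily event counts per region.
daily_counts = (
    cleansed
    .groupBy(F.to_date("event_ts").alias("event_date"), "region")
    .agg(F.count("*").alias("events"))
)

daily_counts.write.mode("overwrite").parquet("hdfs:///data/reports/daily_counts")
```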
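For the data-profiling side of the role, a sketch like the one below could serve as the starting point for a data quality report: per-column null counts and distinct cardinality. The input path and dataset are hypothetical.

```python
# Minimal data-profiling sketch: null counts and distinct counts per column,
# the kind of summary a data quality report might start from.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("profiling-sketch").getOrCreate()
df = spark.read.parquet("hdfs:///data/customers")  # hypothetical path

profile = df.select(
    # count(when(isNull, 1)) counts only the null rows in each column
    *[F.count(F.when(F.col(c).isNull(), 1)).alias(f"{c}_nulls") for c in df.columns],
    *[F.countDistinct(c).alias(f"{c}_distinct") for c in df.columns],
)
profile.show(truncate=False)
```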
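And for the deduplication and matching logic, one common pattern is to build a normalized match key and keep the most recent record per key using a window function. The column names (name, email, updated_at) and paths here are hypothetical; real matching logic would depend on the actual data.

```python
# Minimal deduplication sketch in PySpark: normalize candidate match keys,
# then keep the newest record per key. Column names are hypothetical.
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dedupe-sketch").getOrCreate()
records = spark.read.parquet("hdfs:///data/customers")  # hypothetical path

# Build a normalized match key: lowercased, trimmed name + email.
keyed = records.withColumn(
    "match_key",
    F.concat_ws("|", F.lower(F.trim("name")), F.lower(F.trim("email"))),
)

# Rank duplicates within each key by recency and keep only the newest row.
w = Window.partitionBy("match_key").orderBy(F.col("updated_at").desc())
deduped = (
    keyed
    .withColumn("rn", F.row_number().over(w))
    .filter(F.col("rn") == 1)
    .drop("rn", "match_key")
)

deduped.write.mode("overwrite").parquet("hdfs:///data/customers_deduped")
```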
Experience required: 5 years