Job Seekers, please send resumes to resumes@hireitpeople.com

Hadoop Support
Job Description:
- As a Big Data Engineer, you will be a member of our Big Data & Analytics team, responsible for support and enhancements
- Partner with data analysts, product owners, and data scientists to better understand requirements, identify bottlenecks, and drive resolutions
- Support/enhance data pipelines and ETL jobs built on heterogeneous data sources
- You will support/enhance data ingestion from various source systems into Hadoop using Kafka, Sqoop, Spark Streaming, etc.
- You will transform data using data mapping and data processing capabilities such as Spark and Spark SQL
- You will be responsible for ensuring that the platform goes through Continuous Integration (CI) and Continuous Deployment (CD) with DevOps automation for any enhancements or bug fixes
- You will be responsible for supporting/owning deployment activities for data pipeline/ETL job enhancements and bug fixes
- Support Big Data and batch/real-time analytical solutions leveraging transformational technologies such as Apache Beam
- You will research and assess open source technologies and components, and recommend and integrate them into the design and implementation
- Monitor all data pipeline, data profiling, and ETL jobs; act as first responder to any pipeline failures and be responsible for debugging them
- You will work with development and QA teams to enhance existing pipelines and integration APIs, and provide Hadoop ecosystem services
- Expand and grow data platform capabilities to solve new data problems and challenges
- Migrate existing pipelines from the BODS/MuleSoft technology stack to an open source technology stack
Basic Qualifications:
- 4+ years of experience with the Hadoop ecosystem and Big Data technologies
- Ability to adapt conventional big data frameworks and tools to the use cases required by the project
- Hands-on experience with the Hadoop ecosystem (HDFS, MapReduce, HBase, Hive, Impala, Spark, Kafka, Kudu, Solr)
- Experience building stream-processing systems using solutions such as Spark Streaming, Storm, or Flink
- Experience with other open source technologies such as Druid, Elasticsearch, Logstash, etc. is a plus
- Knowledge of design strategies for developing scalable, resilient, always-on data lakes
- Experience with Agile (Scrum) development methodology
- Strong development/automation skills. Must be very comfortable with reading and writing Scala, Python or Java code.
- Excellent interpersonal and teamwork skills
- Ability to work in a fast-paced environment and manage multiple simultaneous priorities
- Can-do attitude toward problem solving, quality, and execution
- Bachelor's or Master's degree in Computer Science or Information Technology is desired.