Senior Hadoop Developer Resume
New York
PROFESSIONAL SUMMARY:
- 10 years of IT industry experience encompassing a wide range of skill sets, roles, and industry verticals.
- Certified Big Data (Hadoop) developer; certified IBM Database Developer; trained in Lean concepts.
- Hands-on experience in development and data analytics using Hadoop (Big Data) tools and technologies, including HDFS, MapReduce, Hive, Pig, HBase, Spark, Flume, Sqoop, DMX SyncSort, HDInsight, Azure, and Oozie.
- Strong database skills in DB2, Hive, Oracle, PL/SQL, MySQL, Big SQL, and NoSQL databases such as HBase; familiar with Cassandra.
- Experienced in installing, configuring, managing, and testing Hadoop ecosystem components.
- Experienced in developing MapReduce programs in Java.
- Used Apache Spark with Scala/Python for large-scale data processing, real-time analytics, and ETL design.
- Experienced in data warehouse concepts and ETL tools (Teradata).
- Experienced with Teradata SQL Assistant, data import/export, and data loading with utilities such as BTEQ, MultiLoad, FastLoad, and FastExport in UNIX environments.
- Experienced with stored procedures, functions, triggers, macros, and SQL*Loader.
- Experienced in UNIX shell scripting.
- Good knowledge of Maestro, StarTeam, Build Forge, and the TWS scheduler.
- Experienced with workflow schedulers and data architecture, including data ingestion pipeline design and data modeling.
- Functional knowledge of insurance, financial, banking, and healthcare systems.
- Good experience in all phases of the systems life cycle: development, testing (unit, system, integration, and regression), and pre-production support.
- Proficient in analyzing and translating business requirements to technical requirements and architecture.
- Performed knowledge management in the form of AIDs, project knowledge documents, and change documents.
- Experienced in handling internal and external functional, process and data audits.
TECHNICAL SKILLS:
Big Data Ecosystems: HDFS, Hive, Pig, MapReduce, Spark, Sqoop, HBase, Cassandra, ZooKeeper, Flume, DMX, Oozie, Avro, and Hue
Languages: C#, .NET, Java, PL/SQL, Python, Scala, UNIX shell scripting, HiveQL, Pig Latin
Databases: MySQL, Big SQL, NoSQL, SQL Server, Oracle, Exadata, DMS 1100, DB2, PostgreSQL
Operating Systems: UNIX, Windows, MVS/ESA, z/OS
ETL/Reporting: Teradata
Methodologies: Waterfall, Scrum, and Agile
Tools: RPM, MPP, TestDirector, TWS scheduler, Clarity, Quality Center, Service Center, SFTP, Teradata SQL Assistant, Toad, SSH, Hue, Eclipse, Maven, PuTTY, BigInsights, Cloudera, Beeline, Visual Studio, VS Code, CuteFTP, SQL Server Management Studio, Azure DevOps, Team Server, Power BI
PROFESSIONAL EXPERIENCE:
Confidential
Senior Hadoop Developer
Responsibilities:
- Planned, designed, and performed end-to-end setup of Azure sandbox instances.
- Actively involved in setting up CDH and integrating it with Azure Data Lake Store (ADLS).
- Developed numerous Spark jobs in Scala 2.10.x for cleansing data and analyzing it in Impala 2.1.0 (a cleanse-and-load sketch follows this list).
- Developed FTP scripts to bring files from different sources into the Hadoop data lake.
- Responsible for ingestion of data from Blob to Kusto and maintaining the PPE and PROD pipelines.
- Developed Python scripts for NLP pattern search.
- Responsible for creating Hive tables and partitions, loading data, and writing Hive queries.
- Imported and exported data using Sqoop between the Hadoop Distributed File System (HDFS) and relational database systems.
- Responsible for building reporting dashboards in Power BI and publishing them to the cloud for users.
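The cleanse-and-load pattern behind the Spark jobs above can be sketched as follows. The sketch is in PySpark rather than the original Scala, and every database, table, and column name (staging.raw_events, event_id, event_ts, analytics.events_clean) is an illustrative assumption, not a project artifact.

```python
# Minimal PySpark sketch of a cleanse-and-load job; names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("cleanse-and-load")
         .enableHiveSupport()
         .getOrCreate())

raw = spark.table("staging.raw_events")

clean = (raw
         .dropDuplicates(["event_id"])               # drop duplicate records
         .filter(F.col("event_ts").isNotNull())      # discard rows missing a timestamp
         .withColumn("event_date", F.to_date("event_ts")))

# Write to a date-partitioned Hive table for downstream Hive/Impala queries.
(clean.write
      .mode("overwrite")
      .partitionBy("event_date")
      .saveAsTable("analytics.events_clean"))
```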
Environment: Azure, HDFS, Azure DevOps, Hive, Scala, Python, Oozie, Java, SQL Server, Ubuntu/UNIX, VS Code, Scrum/Agile, Power BI, Hue, PuTTY, Cloudera, Beeline, TWS Scheduler.
Confidential
Lead Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hive, Impala, Scala, and DMX.
- Imported and exported data into HDFS and Hive using Sqoop.
- Developed FTP scripts to bring files from different sources into Hadoop.
- Used DMX SyncSort to build ETL pipelines for packed data.
- Implemented A/B switch logic with Oracle Exadata for parallel processing.
- Implemented partitioning in Hive (a partitioned-table sketch follows this list).
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Deployed algorithms in PySpark on complex datasets.
- Experience using SequenceFile, Avro, and Parquet file formats.
- Produced project plans and estimations.
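A minimal sketch of the Hive partitioning pattern referenced above, issued through SparkSession.sql; the warehouse.transactions and staging.transactions_raw tables and their columns are hypothetical.

```python
# Hive partitioned-table DDL plus a dynamic-partition load, via Spark SQL.
from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

spark.sql("""
    CREATE TABLE IF NOT EXISTS warehouse.transactions (
        txn_id     STRING,
        account_id STRING,
        amount     DECIMAL(12,2)
    )
    PARTITIONED BY (load_date STRING)
    STORED AS PARQUET
""")

# Allow dynamic partitions, then insert one partition per load date.
spark.sql("SET hive.exec.dynamic.partition=true")
spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
spark.sql("""
    INSERT OVERWRITE TABLE warehouse.transactions PARTITION (load_date)
    SELECT txn_id, account_id, amount, load_date
    FROM staging.transactions_raw
""")
```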
Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Scala, Python, HBase, Oozie, DMX SyncSort, YARN, Spark, Core Java, Oracle Exadata, Ubuntu/UNIX, Eclipse, JDBC drivers, MySQL, Linux, XML, CRM, SVN, Hue, PuTTY, Cloudera, Beeline, TWS Scheduler.
Confidential, New York
Lead Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hive, Impala, Spark, and Greenplum.
- Implemented partitioning and bucketing in Hive.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Deployed algorithms in Scala with Spark on complex datasets; performed Spark-based development in Scala.
- Created Java UDFs for Pig and Hive (a PySpark analog is sketched after this list).
- Experience using SequenceFile, Avro, Parquet, and text file formats.
- Good working knowledge of Amazon Web Services components such as EC2, EMR, S3, EBS, and ELB.
- Produced estimations and technical design specifications for projects.
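The Java UDFs themselves are not reproduced here; the sketch below shows the same idea as a PySpark UDF (a custom column-level function). The phone-number column and input path are hypothetical.

```python
# PySpark analog of a column-level UDF; column/path names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.getOrCreate()

@F.udf(returnType=StringType())
def normalize_phone(raw):
    # Keep digits only, e.g. "(212) 555-0100" -> "2125550100".
    return "".join(ch for ch in raw if ch.isdigit()) if raw else None

contacts = spark.read.parquet("/data/contacts/")      # hypothetical Parquet input
contacts.withColumn("phone_clean", normalize_phone("phone")).show()
```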
Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Scala, HBase, Oozie, YARN, Spark, Core Java, Teradata, SQL, Ubuntu/UNIX, Eclipse, Maven, JDBC drivers, MySQL, Linux, AWS, XML, CRM, SVN, Hue, PuTTY, Cloudera, Beeline, TWS Scheduler.
Confidential, CA
Lead Hadoop Developer
Responsibilities:
- Created the project using Hive, Big SQL, and Pig.
- Involved in data modeling in Hadoop.
- Created Hive tables and worked with them using HiveQL.
- Wrote Apache Pig scripts to process HDFS data (a PySpark analog is sketched after this list).
- Automated tasks using UNIX shell scripts.
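The original Pig scripts are not available; under that caveat, the same style of HDFS processing (read, group, aggregate, write) is sketched below as a PySpark analog, with hypothetical paths and columns.

```python
# Read raw HDFS data, aggregate per day, and write the result back out.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

clicks = spark.read.csv("hdfs:///data/raw/clickstream/", header=True)

daily_users = (clicks.groupBy("visit_date")
                     .agg(F.countDistinct("user_id").alias("unique_users")))

daily_users.write.mode("overwrite").csv("hdfs:///data/out/daily_users/", header=True)
```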
Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Scala, Python, HBase, Oozie, YARN, Spark, Core Java, Oracle, SQL, Ubuntu/UNIX, Eclipse, Maven, JDBC drivers, Mainframe, MySQL, Linux, AWS, XML, CRM, SVN, PDSH, PuTTY, BigInsights
Confidential
Senior Developer
Responsibilities:
- Understood the requirements and built the HBase data model.
- Loaded historical data as well as incremental customer and other data into Hadoop through Hive.
- Imported and exported large data sets between various data sources and HDFS using Sqoop (see the sketch after this list).
- Performed load balancing of data across the cluster and performance tuning of various jobs running on it.
- Developed Oozie workflows for scheduling and orchestrating the ETL process.
- Developed applications using Eclipse.
- Performed process enhancement through SQL tuning.
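A hedged sketch of a Sqoop import driven from Python; the JDBC URL, credentials file, source table, and target directory are placeholders rather than actual project values.

```python
# Launch a Sqoop import of an RDBMS table into HDFS from Python.
import subprocess

cmd = [
    "sqoop", "import",
    "--connect", "jdbc:mysql://db.example.com/sales",   # placeholder source DB
    "--username", "etl_user",
    "--password-file", "/user/etl/.sqoop_pwd",          # HDFS-resident password file
    "--table", "orders",
    "--target-dir", "/data/raw/orders",                 # HDFS landing directory
    "--num-mappers", "4",                               # parallel import tasks
]
subprocess.run(cmd, check=True)
```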
Environment: Hadoop, HDFS, MapReduce, Java, Hive, Hue, Pig, Flume, Sqoop, HBase, Oozie, YARN, ZooKeeper, Eclipse, Maven, BigInsights
Confidential
Senior Developer
Responsibilities:
- Designed the technical design document (TDD, low level) from the software requirements specification (SRS, high level).
- Used Python scripts to transform the data (a pre-load transform is sketched after this list).
- Fixed issues with existing FastLoad/MultiLoad scripts for smoother, more effective loading of data into the warehouse.
- Created BTEQ scripts with data transformations for loading the base tables.
- Generated reports using Teradata BTEQ.
- Optimized and tuned Teradata SQL to improve batch performance and data response times for users.
- Used the FastExport utility to extract large volumes of data and send files to downstream applications.
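A minimal sketch of the kind of pre-load Python transform mentioned above: reformat dates and re-delimit an extract before the loader picks it up. The file names and the three-column layout are assumptions.

```python
# Normalize dates and re-delimit a CSV extract for a Teradata load utility.
import csv
from datetime import datetime

with open("extract_raw.csv", newline="") as src, \
     open("extract_clean.txt", "w", newline="") as dst:
    reader = csv.reader(src)
    writer = csv.writer(dst, delimiter="|")             # pipe-delimited for the loader
    for acct, amount, posted in reader:
        # Rewrite MM/DD/YYYY dates as ISO YYYY-MM-DD and trim stray whitespace.
        posted = datetime.strptime(posted, "%m/%d/%Y").strftime("%Y-%m-%d")
        writer.writerow([acct.strip(), amount.strip(), posted])
```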
Environment: Teradata V2R12, Teradata SQL Assistant, MultiLoad, FastLoad, BTEQ, Erwin, UNIX shell scripting, macros, stored procedures, DB2, COBOL, Python, SAS, PL/SQL, FileZilla
Confidential
Developer
Responsibilities:
- Created and reorganized all types of database objects including tables, views, indexes, sequences, synonyms and setting proper parameters and values for all the objects.
- Wrote database triggers, stored procedures, stored functions, and stored packages to perform various automated tasks for better performance.
- Created Shell Scripts for invoking SQL scripts.
- Made effective use of table functions, indexes, table partitioning, analytic functions, and materialized views.
- Experience with performance tuning for the Oracle RDBMS using EXPLAIN PLAN and optimizer hints.
- Involved in continuous enhancements and fixing of production problems.
- Verified and validated data using SQL queries (a validation sketch follows this list).
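An illustrative version of the SQL-based validation above, using the cx_Oracle driver; the connection details and table names are placeholders, not values from the actual engagement.

```python
# Reconcile row counts between staging and target tables after a load.
import cx_Oracle

conn = cx_Oracle.connect("app_user", "app_pwd", "db.example.com/ORCLPDB1")
cur = conn.cursor()

cur.execute("SELECT COUNT(*) FROM staging_orders")
staged, = cur.fetchone()
cur.execute("SELECT COUNT(*) FROM orders")
loaded, = cur.fetchone()

if staged != loaded:
    raise RuntimeError(f"row-count mismatch: {staged} staged vs {loaded} loaded")
conn.close()
```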
Environment: Oracle 10g, .NET, SQL, PL/SQL, UNIX, SQL*Loader, SQL Navigator, Toad, SQL Developer.