Hadoop and Spark Developer Resume
New York
SUMMARY
- Overall 5+ years of professional experience as a Hadoop Developer using the Apache Spark framework, and as an Oracle Database Administrator.
- Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as Apache Spark, HDFS, HBase, Spark SQL, Sqoop, ZooKeeper, Kafka, and Flume.
- Good knowledge of Apache Cassandra and MongoDB.
- Hands-on experience with Spark's fundamental building blocks, RDDs, and the transformations, actions, and functions used to implement business logic on them (see the sketch after this list).
- In-depth understanding of DataFrames and Datasets in Spark SQL.
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
- Good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.
- Created Hive external tables, views, and scripts for transformations such as filtering, aggregation, and partitioning.
- Expert in writing business-analytics scripts in Hive SQL.
- Worked with IDEs such as Eclipse and IntelliJ IDEA for developing, deploying, and debugging applications.
- Good knowledge of data warehousing, ETL development, distributed computing, and large-scale data processing.
- Experienced in working with different file formats such as text, SequenceFile, XML, and JSON.
- Expertise in working with relational databases such as Oracle 10g and SQL Server 2012.
- Good knowledge of writing stored procedures and functions in SQL and PL/SQL.
- Configured Oracle Data Guard for disaster recovery implementations.
- Planned and supported Oracle upgrades from 10g to 11g to 12c; designed RMAN backup and recovery strategies and Oracle high-availability solutions.
- Hands-on experience with data analysis, logical and physical design, backup and recovery, performance tuning, database installation, and upgrades.
- Collaborated with the infrastructure, network, database, application, and BI teams to ensure data quality and availability.
- Strong knowledge of the software development life cycle and expertise in detailed design documentation.
- Excellent communication skills; able to perform at a high level and meet deadlines.
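A minimal Scala sketch of the RDD and Dataset work described above; the record type, data, and logic are illustrative assumptions rather than project code:

```scala
import org.apache.spark.sql.SparkSession

object RddAndDatasetSketch {
  // Hypothetical record type for the Dataset example.
  case class Order(orderId: Long, category: String, amount: Double)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("rdd-dataset-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // RDD: transformations (map, reduceByKey) are lazy; the collect() action
    // triggers execution of the whole lineage.
    val totals = spark.sparkContext
      .parallelize(Seq("books,10", "toys,4", "books,3"))
      .map(_.split(","))                        // transformation
      .map(parts => (parts(0), parts(1).toInt)) // pair RDD of (key, value)
      .reduceByKey(_ + _)                       // shuffle transformation
      .collect()                                // action
    totals.foreach(println)

    // Dataset: the same aggregation through the typed Spark SQL API.
    val orders = Seq(Order(1L, "books", 10.5), Order(2L, "books", 4.0)).toDS()
    orders.groupBy($"category").sum("amount").show()

    spark.stop()
  }
}
```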
TECHNICAL SKILLS
Big Data: HDFS, Apache Spark, Spark SQL, Spark Streaming, ZooKeeper, Hive, Sqoop, HBase, Kafka, Flume, YARN, Cassandra, MongoDB
Languages: Java, Scala, SQL/PL-SQL, Shell Scripting
Java Technologies: JSP, Servlets, JDBC, OOPS Concept
Database: MySQL, MongoDB, Cassandra, Oracle 10g/11g, Microsoft SQL Server 2014
IDE / Testing Tools: Eclipse, IntelliJ IDEA
Operating System: Windows, UNIX, Linux
Tools: SQL Developer, Maven, Hue, TOAD
PROFESSIONAL EXPERIENCE
Confidential, New York
Hadoop and Spark Developer
Responsibilities:
- Involved in requirements gathering, working closely with business analysts.
- Responsible for creating technical documents such as high-level and low-level design specifications.
- Installed and configured Cloudera Manager for easier management of the existing Hadoop cluster.
- Configured property files such as core-site.xml, hdfs-site.xml, yarn-site.xml, mapred-site.xml, and hadoop-env.sh based on job requirements.
- Used Sqoop to transfer data between RDBMS and HDFS.
- Worked with the business functional lead to review and finalize requirements and data-profiling analysis.
- Implemented complex Spark programs to perform joins across different tables (see the join sketch after this position).
- Explored Spark to improve the performance and optimization of existing algorithms in Hadoop, using Spark Context, Scala, Spark SQL, DataFrames, and pair RDDs.
- Responsible for creating tables based on business requirements.
- Produced data visualizations and generated reports to present results clearly.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data in various formats such as text, XML, and JSON.
- Utilized Agile Scrum methodology to help manage and organize the project, with regular code review sessions.
Environment: Hadoop HDFS, Apache Spark, Spark-Core, Spark-SQL, Scala, JDK 1.8, CDH 5, Sqoop, MySQL, CentOS Linux
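The following Scala sketch illustrates the kind of multi-table join described above; table contents, column names, and the broadcast choice are illustrative assumptions:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object JoinSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("join-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical fact and dimension tables.
    val transactions = Seq((1L, 101L, 250.0), (2L, 102L, 75.0), (3L, 101L, 30.0))
      .toDF("txn_id", "customer_id", "amount")
    val customers = Seq((101L, "NY"), (102L, "CA")).toDF("customer_id", "state")

    // Broadcasting the small dimension side avoids shuffling the large fact
    // table, a common optimization over a plain shuffle join.
    val joined = transactions.join(broadcast(customers), Seq("customer_id"), "inner")
    joined.groupBy("state").sum("amount").show()

    spark.stop()
  }
}
```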
Confidential
Hadoop and Spark Developer
Responsibilities:
- Worked with Spark to improve the performance and optimization of existing algorithms in Hadoop, using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Developed iterative algorithms using Spark Streaming in Scala for near real-time dashboards.
- Developed custom aggregate functions using Spark SQL and performed interactive querying.
- Applied Hive partitioning and bucketing concepts, designing both managed and external tables to optimize performance (see the sketch after this position).
- Built Hive external tables, views, and scripts for transformations such as filtering, aggregation, and partitioning.
- Handled importing of data from various sources, performed transformations using Hive, and loaded data from Teradata into HDFS.
- Wrote business-analytics scripts in Hive SQL.
- Responsible for building and scheduling automation jobs using the Automic scheduler with the Aorta framework.
- Worked with data in multiple file formats, including Parquet, SequenceFile, and text/CSV.
- Expertise in creating ThoughtSpot pinboards and bringing in all data needed for reports per business requirements.
- Participated in meetings with internal and external clients, helping to frame projects and design solutions based on client needs and the problems to be solved.
- Followed Agile methodology with Scrum meetings to track, optimize, and tailor features to customer needs.
- Gained strong business knowledge of different product categories and their designs.
- Involved in developing ThoughtSpot reports and automating workflows to load data.
Environment: Hadoop HDFS, Apache Spark, Spark-Core, Spark-SQL, Scala, JDK 1.7, Sqoop, Eclipse, MySQL, AWS EC2, HBase, CentOS Linux and ZooKeeper
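A sketch of the Hive table design described above, expressed through Spark SQL in Scala; it assumes a Spark build with Hive support and a configured metastore, and all table names, paths, and columns are hypothetical:

```scala
import org.apache.spark.sql.SparkSession

object HiveTableSketch {
  def main(args: Array[String]): Unit = {
    // enableHiveSupport() requires Spark compiled with Hive classes.
    val spark = SparkSession.builder()
      .appName("hive-table-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // External table: Hive owns only the metadata; dropping the table
    // leaves the files at the HDFS location intact.
    spark.sql("""
      CREATE EXTERNAL TABLE IF NOT EXISTS sales_ext (
        txn_id BIGINT,
        customer_id BIGINT,
        amount DOUBLE
      )
      PARTITIONED BY (sale_date STRING)
      STORED AS PARQUET
      LOCATION '/data/warehouse/sales_ext'
    """)

    // Register partition directories that were written directly to HDFS.
    spark.sql("MSCK REPAIR TABLE sales_ext")

    // Bucketing via the DataFrame writer: rows are hashed on customer_id into
    // a fixed number of buckets, which helps joins and sampling on that key.
    spark.table("sales_ext")
      .write
      .mode("overwrite")
      .bucketBy(32, "customer_id")
      .sortBy("customer_id")
      .saveAsTable("sales_bucketed") // managed table, unlike sales_ext

    // A filter on the partition column prunes directories instead of scanning
    // the full dataset.
    spark.sql(
      "SELECT sale_date, SUM(amount) FROM sales_ext WHERE sale_date = '2017-01-01' GROUP BY sale_date"
    ).show()

    spark.stop()
  }
}
```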
Confidential
Hadoop and Spark Developer
Responsibilities:
- Involved in requirements gathering in coordination with business analysts.
- Worked closely with business analysts and the client to create technical documents such as high-level and low-level design specifications.
- Implemented best-income logic using Spark SQL.
- Experienced in loading and transforming large sets of structured, semi-structured, and unstructured data.
- Used Sqoop to import data from MySQL into HDFS on a regular basis.
- Developed RDDs to schedule various Hadoop programs.
- Wrote Spark SQL queries for data analysis to meet business requirements.
- Experienced in defining job flows.
- Managed cluster coordination services for Kafka through ZooKeeper.
- Serialized JSON data and stored it in tables using Spark SQL.
- Wrote shell scripts to automate the process flow.
- Stored the extracted data in HDFS using Flume.
- Experienced with multiple file formats, including XML, JSON, CSV, and various compressed formats.
- Experienced in Kafka and Spark integration for real-time data processing (see the sketch after this position).
- Developed Kafka producer and consumer components for real-time data processing.
- Experienced in writing Spark SQL queries in Scala.
- Communicated all issues and participated in weekly strategy meetings.
Environment: Hadoop HDFS, Apache Spark, Spark-Core, Spark-SQL, Scala, JDK 1.7, Sqoop, Eclipse, MySQL, CentOS Linux, ZooKeeper
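A minimal Scala sketch of Kafka-to-Spark integration with JSON parsing, as described above; it assumes Spark Structured Streaming with the spark-sql-kafka package on the classpath, and the broker address, topic, and schema are placeholders:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{DoubleType, StringType, StructType}

object KafkaJsonSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-json-sketch")
      .getOrCreate()

    // Expected shape of each JSON event (hypothetical fields).
    val schema = new StructType()
      .add("event_id", StringType)
      .add("category", StringType)
      .add("amount", DoubleType)

    // Consume from Kafka; broker and topic names are placeholders.
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "events")
      .load()

    // Kafka delivers raw bytes: cast the value column to a string, then
    // parse the JSON payload into typed columns.
    val events = raw
      .select(from_json(col("value").cast("string"), schema).as("e"))
      .select("e.*")

    // The console sink keeps the sketch self-contained; a real job would
    // write to a table or another topic.
    val query = events.writeStream
      .format("console")
      .outputMode("append")
      .start()

    query.awaitTermination()
  }
}
```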
Confidential
Oracle Database Administrator
Responsibilities:
- Managed over 20 critical applications single-handedly.
- Configured Data Guard for OLTP databases.
- Upgraded databases to 12c.
- Worked with application teams on performance-related issues.
- Rebuilt indexes for better performance as part of ongoing Oracle database maintenance.
- Generated performance reports and performed daily database health checks, using utilities such as AWR and Statspack to gather performance statistics.
- Identified and tuned poorly performing SQL statements using EXPLAIN PLAN, SQL Trace, and TKPROF; analyzed tables and indexes to improve query performance.
- Troubleshot various issues such as user database connectivity and privilege problems.
- Created users and allocated appropriate tablespace quotas with the necessary privileges and roles across all databases.
- Wrote database-monitoring scripts in shell and PL/SQL (procedures, functions, and packages).
- Created and cloned Oracle instances and databases on ASM; performed database cloning and relocation activities.
- Managed tablespaces, data files, redo logs, tables, and their segments.
- Maintained data integrity; managed profiles, resources, password security, users, privileges, and roles.
- Performed RMAN backups, restores, cloning, and database and application refreshes.
- Monitored and planned ASM storage across all databases.
Environment: Oracle 11g/12c, TOAD, Linux, UNIX, PuTTY, Enterprise Manager, SQL Server, Windows Server, web services, WebLogic