Big Data Developer Resume
San Diego, CA
SUMMARY:
- 8 years of experience in software design and development, including 5+ years in Hadoop, data warehousing solutions, Big Data analytics, and development.
- Sound knowledge of and familiarity with the data journey (data ingestion > transformation > discovery > advanced analytics).
- Experience in installation, configuration, supporting and managing Hadoop clusters.
- Extensive experience in developing complex MapReduce programs against structured and unstructured data.
- Experience in tuning and troubleshooting performance issues in Hadoop clusters.
- Extensive experience in working with structured data in Hive and improving performance through advanced techniques such as bucketing, partitioning, and optimizing self-joins.
- Experience in using tools such as Sqoop, Flume, Kafka, NiFi, and Pig to ingest structured, semi-structured, and unstructured data into the cluster, and in creating complex workflows using Oozie.
- Experienced in working with Spark data structures such as RDDs, Datasets, and DataFrames; used both Spark's built-in Web UI and instrumentation such as Ganglia to monitor Spark jobs, and improved processing times through partitioning, broadcasting, and checkpointing practices.
- Experience in creating DStreams and DataFrames from streaming sources such as Flume and Kafka and performing real-time Spark transformations and actions on them (a minimal sketch follows this summary).
- Excellent understanding and knowledge of ETL tools like Informatica, Talend and BI tools like Tableau.
- Expertise in working with major NoSQL database solutions such as Cassandra, HBase, and MongoDB.
- Hands-on scripting experience in Python and Linux/Unix shell.
- Good working knowledge of the AWS stack for big data analytics (S3, EMR, EC2, Kinesis, DynamoDB, Redshift, Elasticsearch).
- Experience in understanding Hadoop security requirements and integrating with Kerberos authentication and authorization infrastructure.
- Built predictive and analytic models using various Machine Learning algorithms with Spark ML.
- Implemented K-Means clustering, Logistic Regression, and SVM classifiers for various business scenarios (see the Spark ML sketch after this summary).
- Excellent communication and analytical skills and flexible to adapt to evolving technology.
- Ability to think independently, creatively solve problems, and to connect various thoughts, observations, and results into innovative solutions.
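The snippet below is a minimal sketch of the Kafka-to-DStream pattern mentioned above, not the production pipeline itself; the broker address, topic name, consumer group, and 5-second batch interval are illustrative assumptions.

```java
import java.util.*;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.*;
import org.apache.spark.streaming.kafka010.*;

public class KafkaStreamSketch {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("KafkaDStreamExample");
        // Micro-batch interval of 5 seconds (illustrative choice)
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(5));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");   // assumed broker address
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "example-consumer-group");  // assumed consumer group
        kafkaParams.put("auto.offset.reset", "latest");

        Collection<String> topics = Arrays.asList("events");    // assumed topic name

        // Create a direct DStream from Kafka
        JavaInputDStream<ConsumerRecord<String, String>> stream =
            KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams));

        // Simple real-time transformation: count records in each micro-batch
        stream.map(ConsumerRecord::value)
              .count()
              .print();

        jssc.start();
        jssc.awaitTermination();
    }
}
```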
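As a hedged illustration of the Spark ML work summarized above, the sketch below fits a K-Means model and a Logistic Regression classifier on a pre-vectorized training set; the input path, k value, and hyperparameters are assumptions, not values from the original projects.

```java
import org.apache.spark.ml.classification.LogisticRegression;
import org.apache.spark.ml.classification.LogisticRegressionModel;
import org.apache.spark.ml.clustering.KMeans;
import org.apache.spark.ml.clustering.KMeansModel;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SparkMlSketch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().appName("SparkMLExample").getOrCreate();

        // Assumes a training set already vectorized into a "features" column
        // with a numeric "label" column (path and column names are illustrative).
        Dataset<Row> training = spark.read().format("libsvm").load("data/training.libsvm");

        // K-Means clustering with an assumed k of 3
        KMeansModel clusters = new KMeans().setK(3).setSeed(1L).fit(training);
        clusters.transform(training).select("prediction").show(5);

        // Logistic Regression classifier with illustrative hyperparameters
        LogisticRegressionModel lrModel = new LogisticRegression()
            .setMaxIter(10)
            .setRegParam(0.01)
            .fit(training);
        lrModel.transform(training).select("label", "prediction").show(5);

        spark.stop();
    }
}
```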
TECHNICAL SKILLS:
- Big Data Ecosystem: Hadoop, MapReduce, HDFS, YARN, Apache Sqoop, Apache Hive, Impala, Apache Spark, Spark MLlib, Apache Flume, Kafka, Apache NiFi, Storm, Oozie, Zookeeper, Hue, Ambari, Mesos, Apache Solr, Apache Lucene, Elasticsearch, Apache Ranger, Apache Knox, Kerberos
- AWS: Amazon S3, Kinesis, DynamoDB, Redshift, Amazon IAM
- NoSQL Databases: HBase, Cassandra, MongoDB
- Distributions: CDH 5.x/4.x, Hortonworks 2.6
- File Formats: Apache Avro, Parquet, JSON, CSV, RC Files
- Analytics & Visualization: Tableau, Zeppelin/Jupyter Notebooks, Pandas
- Languages: Java, C, C++, Scala, Python, JavaScript
- Web Technologies: SOAP, REST, jQuery, AJAX, XML, HTML, CSS
- Version Control: Git, SVN, CVS
- Operating Systems: Linux (Debian, openSUSE, Arch, Fedora), Windows, Mac OS
PROFESSIONAL EXPERIENCE:
Big Data Developer
Confidential, San Diego, CA
Responsibilities:
- Developed and implemented real-time data pipelines with Spark Streaming, Kafka, and Cassandra to replace the existing lambda architecture without losing its fault-tolerant capabilities.
- Created a Spark Streaming application to consume real-time data from Kafka sources and applied real-time analysis models that are updated as new data arrives in the stream.
- Designed and developed data integration programs in a Hadoop environment with NoSQL data store Cassandra for data access and analysis.
- Integrated the NoSQL database HBase with Apache Spark to move bulk data into HBase.
- Responsible for developing data pipeline using Flume, Sqoop and Spark to extract the data from warehouses and weblogs and store it in HDFS. Automated the process by using Oozie workflows.
- Experienced in writing Hive UDFs to sort struct fields and return complex data types based on the required schema (a simplified UDF sketch follows this list).
- Experienced with both batch processing and stream processing of data sources using Apache Spark.
- Responsible for handling large datasets using partitions, Spark in-memory capabilities, broadcasts, efficient joins, transformations, and other techniques during the ingestion process itself (see the join sketch after this list).
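A much-simplified sketch of the Hive UDF extension point referenced above: the real UDFs sorted struct fields and returned complex types, whereas this example only normalizes a string. The function and class names are hypothetical.

```java
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Simplified Hive UDF: trims and upper-cases a code value.
@Description(name = "normalize_code", value = "_FUNC_(str) - trims and upper-cases a code value")
public class NormalizeCodeUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toUpperCase());
    }
}
```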
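The join sketch below illustrates the broadcast-join and repartitioning techniques mentioned in the last bullet, assuming hypothetical events/users Parquet datasets and a user_id join key.

```java
import static org.apache.spark.sql.functions.broadcast;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class BroadcastJoinSketch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().appName("BroadcastJoinExample").getOrCreate();

        // Large fact table and a small dimension table (paths are illustrative)
        Dataset<Row> events = spark.read().parquet("hdfs:///data/events");
        Dataset<Row> users  = spark.read().parquet("hdfs:///data/users");

        // Broadcasting the small side avoids shuffling the large table
        Dataset<Row> joined = events.join(broadcast(users), "user_id");

        // Repartition by a high-cardinality key before an expensive aggregation
        joined.repartition(200, joined.col("user_id"))
              .groupBy("user_id")
              .count()
              .write()
              .mode("overwrite")
              .parquet("hdfs:///data/event_counts");

        spark.stop();
    }
}
```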
Hadoop Developer
Confidential, Overland Park, Kansas
Responsibilities:
- Created Spark Streaming applications to aggregate large volumes of clickstream data by session and store the aggregated data in HDFS.
- Optimized the processing time of the applications by caching and partitioning data where appropriate, and tuned the Spark applications via configuration changes.
- Developed Kafka producers and consumers for message handling and wrote Spark scripts to parse the logs and structure them in tabular format to facilitate effective querying of the log data (a producer sketch follows this list).
- Improved the performance of data transformations by optimizing the existing Hadoop MapReduce algorithms using Spark Context, Spark SQL, DataFrames, and Spark on YARN.
- Worked with Teradata Tools and Utilities (FastLoad, MultiLoad, BTEQ, FastExport)
- Developed UDFs in Python to add new functionality and enhance existing functionality in Pig and Hive scripts.
- Worked on importing and exporting data from various NoSQL databases frequently.
- Involved in Unit testing and delivered Unit test plans and results documents.
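A minimal Kafka producer sketch in the spirit of the message-handling work above; the broker address, the "weblogs" topic, and the sample log line are assumptions for illustration.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class LogProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");   // assumed broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "all");                          // wait for full acknowledgement

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // One raw log line published to the assumed "weblogs" topic
            String logLine = "127.0.0.1 - - [10/Oct/2023:13:55:36] \"GET /index.html HTTP/1.1\" 200";
            producer.send(new ProducerRecord<>("weblogs", "host-01", logLine));
            producer.flush();
        }
    }
}
```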
Hadoop Developer
Confidential, Atlanta, GA
Responsibilities:
- Installed, configured, supported, and managed Hadoop clusters.
- Used Storm to extract data by designing a topology as per client requirements.
- Optimized Hive queries to extract the customer information from Cassandra.
- Worked on fine-tuning search queries and designing tables, views, and indexes in Cassandra, and wrote DDL and DML scripts for data store operations.
- Used Zookeeper for cluster coordination and Kafka offset monitoring.
- Developed and implemented POCs to load data from Kafka connectors into Cassandra and HDFS.
- Integrated MapReduce with HBase to bulk-load data into HBase using MapReduce programs.
- Wrote MapReduce jobs in Java to parse the web logs and store them in HDFS, and used MRUnit to test and debug the MapReduce programs (a mapper sketch follows this list).
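The mapper below is a simplified sketch of the web-log parsing jobs described above, assuming a space-delimited combined log format; the field position and class name are illustrative.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Emits (status code, 1) for every parsed web-log line; malformed lines are skipped.
public class WebLogMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text statusCode = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Assumes the HTTP status code sits in field 8 of the log line
        String[] fields = value.toString().split(" ");
        if (fields.length > 8) {
            statusCode.set(fields[8]);
            context.write(statusCode, ONE);
        }
    }
}
```

A mapper like this can be exercised in isolation with MRUnit's MapDriver, which is the usual way to unit-test MapReduce code before cluster runs.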
Java Developer
Confidential
Responsibilities:
- Involved in full life cycle development as part of the core design and architecture team.
- Involved in implementing all requirements using a wide range of technologies, including Java, J2EE, the Executor Service framework, web services, and associated technologies.
- Developed business components using Java objects and used the Hibernate framework to map Java classes to the database.
- Worked with POJO class mappings using Spring and Hibernate, and extensively used jQuery and JavaScript for validations.
- Used Hibernate for the backend persistence and designed and built SOAP web service interfaces.
- Designed and developed the Hibernate configuration and the session-per-request design pattern for database connectivity and transaction-scoped session access; used HQL and SQL for fetching and storing data (a minimal entity/HQL sketch follows this list).
- Worked in a multi-threaded environment for process-based applications to interrupt threads in embedded systems, and was actively involved in deployment planning and build setup as per technical specifications.
- Worked on enhancements to already implemented applications, focusing on high-performance improvements to new and existing software.
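A minimal entity/HQL sketch of the Hibernate mapping and session-per-request pattern described above; the Customer entity, table name, and query are hypothetical examples, not the application's actual model.

```java
import java.util.List;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Table;
import org.hibernate.Session;

// Plain Java object mapped to an assumed "customers" table
@Entity
@Table(name = "customers")
public class Customer {
    @Id
    @GeneratedValue
    private Long id;
    private String name;
    private String email;

    public Long getId() { return id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getEmail() { return email; }
    public void setEmail(String email) { this.email = email; }

    // HQL query run within a session-per-request transaction (field names are illustrative)
    public static List<Customer> findByEmailDomain(Session session, String domain) {
        return session
            .createQuery("from Customer c where c.email like :domain", Customer.class)
            .setParameter("domain", "%" + domain)
            .getResultList();
    }
}
```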
Java Developer
Confidential
Responsibilities:
- Involved in architecting and implementing all requirements using a wide range of technologies, including Java and Spring MVC, and used Hibernate for backend persistence.
- Used Spring framework for dependency injection and integrated with Hibernate and JSF.
- Involved in writing the Spring configuration XML file containing object declarations and dependencies (a wiring sketch follows this list).
- Developed core Java classes using OOP concepts.
- Provided technical support for client-side issues.
- Performed business systems analysis and proposed solutions that fit requirements.
- Deployment planning and build setup as per technical specifications.
- Worked on creating POJO classes for the Spring framework.
- Worked on enhancements to already implemented applications, focusing on high-performance improvements to new and existing software.
- Worked in a multi-threaded environment.
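A small wiring sketch of the XML-driven dependency injection described above; the bean names, the ReportService/ReportDao classes, and applicationContext.xml are assumptions for illustration.

```java
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

// POJO wired through an XML bean definition (bean ids and the XML file name are assumed)
public class ReportService {
    private ReportDao reportDao;   // injected dependency, declared as a <property> in the XML

    public void setReportDao(ReportDao reportDao) {
        this.reportDao = reportDao;
    }

    public void printReportCount() {
        System.out.println("Reports: " + reportDao.countReports());
    }

    public static void main(String[] args) {
        // Loads bean definitions from an assumed applicationContext.xml on the classpath
        ApplicationContext context = new ClassPathXmlApplicationContext("applicationContext.xml");
        ReportService service = context.getBean("reportService", ReportService.class);
        service.printReportCount();
    }
}

// Simple DAO interface used only to illustrate the injected collaborator
interface ReportDao {
    long countReports();
}
```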