Sharepoint Resume Profile
Professional Summary
- I have 9 years of experience in the design, development and deployment of software applications, including 4 years in data analysis and cloud computing and 3 years with technologies such as Pig, Hive, HBase, Hadoop HDFS/MapReduce/YARN and DataStax Cassandra. My experience spans parallel and cloud computing, data analysis, design and development, and the testing and deployment of software systems and applications, with an emphasis on the object-oriented paradigm. I also have 2 years of experience in embedded systems development, writing device drivers for a Windows-based PXA27x development board using Windows Platform Builder and ARM-based processors, and I have developed DOM APIs in C for a mobile internet browser.
- Experience with data modeling, Hadoop MapReduce systems, Distributed Systems and multi-terabyte warehouses and datamarts.
- Hands on experience installing, configuring and administrating Hadoop clusters.
- Worked closely with the administration team to set up an ad-hoc Cloudera CDH4 cluster for development purposes.
- Developed Java APIs to retrieve, analyze and structure data from NoSQL platforms like CASSANDRA.
- Designed and deployed client specific data-migration plans to transfer data to HDFS using SQOOP and FLUME.
- Implemented partitioning and bucketing schemes for fast data access in HIVE.
- Developed data-analysis implementations in PIG and HIVE. Executed workflows using OOZIE.
- Expertise with analyzing, managing and reviewing Hadoop log files.
- Experience in importing, processing and exporting data with Sqoop between HDFS and RDBMS systems such as MySQL and SQL Server, including cases where the relational data ran to hundreds of gigabytes.
- Created UDFs to implement functionality not available in Pig. Used UDFs from Piggybank UDF Repository.
- Knowledgeable in Spark and the Scala framework, having explored them for a transition from Hadoop MapReduce to Spark.
- Used the Hadoop MapReduce API to write Java programs performing custom joins, filters and groupings (a minimal sketch appears at the end of this summary).
- Designed, developed and tested core Java applications, including multi-threaded components, following the software development life cycle (SDLC) process.
- Significant knowledge of J2EE, including JSP, Servlets, JMS and the Spring/Hibernate frameworks, for building client-server applications.
- Designed, developed and tested DOM APIs in C for a mobile internet browser; this involved converting Java specifications into C.
- Designed, developed and tested device drivers for Windows CE using Platform Builder.
- Experience working with C, MATLAB and HTML, and with UNIX and Linux (Ubuntu, CentOS).
- Team player with good communication, interpersonal and presentation skills and an excellent work ethic.
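By way of illustration only, the following is a minimal sketch (not code from any of the projects below) of a Hadoop MapReduce job of the kind referenced above, written in Java against the Hadoop 2.x API. The input layout, field names and filter threshold are assumed for the example.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Minimal sketch: filter out records below a (hypothetical) amount threshold,
// then group the surviving records by customer id and sum their amounts.
public class FilterGroupJob {

    public static class FilterMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Assumed input layout: customerId,amount (CSV) -- illustrative only.
            String[] fields = value.toString().split(",");
            if (fields.length < 2) {
                return; // skip malformed lines
            }
            int amount = Integer.parseInt(fields[1].trim());
            if (amount >= 100) { // filter: keep only records of 100 or more
                context.write(new Text(fields[0].trim()), new IntWritable(amount));
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int total = 0;
            for (IntWritable v : values) { // grouping happens in the shuffle; sum per key here
                total += v.get();
            }
            context.write(key, new IntWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "filter-and-group");
        job.setJarByClass(FilterGroupJob.class);
        job.setMapperClass(FilterMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```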
Technical Skills:
Professional Work Experience
Confidential
Roles and Responsibilities:
- Built scalable distributed data solutions using Hadoop.
- Used Sqoop to load data to and from HDFS.
- Involved in setting up a 10-node cluster with 160 GB of RAM and 20 TB of disk space. Worked with the Hadoop architect and the Linux admin team to set up, configure and initialize the cluster.
- Wrote and modified XML configuration files to set properties such as node addresses, replication factors and client storage space.
- Implemented searching, sorting and grouping queries to analyze data from Datastax Cassandra.
- Used the Hadoop MapReduce API to write Java programs performing custom joins, filters and groupings. Used the YARN architecture and MapReduce 2.0 on the development cluster.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning and slot configuration.
- Created Hive tables based on business requirements, using partitioning, dynamic partitions and buckets for efficient and fast data access and manipulation (see the sketch at the end of this role). Used Piggybank, a repository of UDFs for Pig Latin.
- Involved in NoSQL database integration and implementation.
- Wrote MapReduce jobs to discover user trends.
- Involved in managing and reviewing Hadoop log files.
- Transferred the analyzed data between relational databases and HDFS using Sqoop, enabling the BI team to visualize analytics.
- Collected the business requirements from Subject Matter Experts like Data Scientists and Business Partners.
Environment: Apache Hadoop 2.3.0, HDFS, Cassandra, MapReduce, Spark, Hive 0.12, Linux, MySQL.
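As a minimal sketch of the Hive partitioning and bucketing work described in this role, a partitioned, bucketed table might be defined and queried from Java over the HiveServer2 JDBC interface roughly as follows. The host, database, table and column names are hypothetical, not taken from the project.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Minimal sketch: define and query a partitioned, bucketed Hive table over JDBC.
// Host, database, table and column names are hypothetical placeholders.
public class HivePartitionExample {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        Connection conn = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "hive", "");
        Statement stmt = conn.createStatement();

        // Partition by event date for pruning; bucket by user id for joins and sampling.
        stmt.execute("CREATE TABLE IF NOT EXISTS user_events ("
                + " user_id BIGINT, action STRING)"
                + " PARTITIONED BY (event_date STRING)"
                + " CLUSTERED BY (user_id) INTO 32 BUCKETS"
                + " STORED AS ORC");

        // Register a partition explicitly; data files would normally be loaded into it.
        stmt.execute("ALTER TABLE user_events"
                + " ADD IF NOT EXISTS PARTITION (event_date = '2014-06-01')");

        // The partition filter means only the 2014-06-01 partition is scanned.
        ResultSet rs = stmt.executeQuery("SELECT action, COUNT(*) FROM user_events"
                + " WHERE event_date = '2014-06-01' GROUP BY action");
        while (rs.next()) {
            System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
        }

        stmt.close();
        conn.close();
    }
}
```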
Confidential
Roles and Responsibilities:
- Deployed, tested and integrated a core Java application for gathering and analyzing data with custom ML libraries to establish patterns and trends in the data.
- Developed unit test cases and used JUnit for unit testing of the application.
- Set up and monitored a scalable, distributed Hadoop HDFS system for a proof of concept on a multi-node cluster (4 servers, each with 64 GB RAM and 10 Gb Ethernet server adapters, in a 1000-core cluster) with a NameNode, Secondary NameNode and DataNodes.
- Extracted the data from staging database into HDFS using SQOOP and FLUME.
- Involved in NoSQL Datastax Cassandra database design, integration and implementation.
- Created ALTER, INSERT and DELETE queries involving lists, sets and maps in DataStax Cassandra (see the sketch at the end of this role).
- Created indices using keys only in Datastax Cassandra.
- Worked closely with the Spark team on parallel computing, exploring RDDs over DataStax Cassandra.
- Implemented Partitioning, Dynamic Partitions and Buckets in HIVE for more efficient data access.
- Coded statistical data analysis routines against Java APIs to analyze data and understand correlations between anagrams in genomic and bioinformatics data.
- Developed custom MapReduce code, generated JAR files containing user-defined functions and integrated them with Hive to make statistical procedures accessible to the entire analysis team.
- Gathered business requirements to determine feasibility and to convert them to technical tasks in the Design Documents.
- Worked closely with business team to gather requirements and add new support features.
Environment: Core Java, JUnit, Hadoop HDFS/MapReduce, Sqoop, Flume, SQL
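A minimal sketch of the CQL collection handling mentioned in this role, assuming Cassandra 2.x and the DataStax Java driver 2.x; the keyspace, table and column names are placeholders, not taken from the project.

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

// Minimal sketch: insert, update and delete against CQL collection columns
// (set, list, map). Keyspace, table and column names are hypothetical.
public class CassandraCollectionsExample {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        try {
            Session session = cluster.connect();

            session.execute("CREATE KEYSPACE IF NOT EXISTS demo WITH replication ="
                    + " {'class': 'SimpleStrategy', 'replication_factor': 1}");
            session.execute("CREATE TABLE IF NOT EXISTS demo.users ("
                    + " user_id text PRIMARY KEY,"
                    + " emails set<text>,"
                    + " recent_logins list<timestamp>,"
                    + " attributes map<text, text>)");

            // Insert a row with initial collection values.
            session.execute("INSERT INTO demo.users (user_id, emails, attributes)"
                    + " VALUES ('u1', {'a@example.com'}, {'plan': 'basic'})");

            // Alter collection contents: add to the set, prepend to the list, put into the map.
            session.execute("UPDATE demo.users SET"
                    + " emails = emails + {'b@example.com'},"
                    + " recent_logins = ['2014-06-01 12:00:00'] + recent_logins,"
                    + " attributes['plan'] = 'premium'"
                    + " WHERE user_id = 'u1'");

            // Delete a single map entry rather than the whole row.
            session.execute("DELETE attributes['plan'] FROM demo.users WHERE user_id = 'u1'");

            ResultSet rs = session.execute("SELECT user_id, emails FROM demo.users");
            for (Row row : rs) {
                System.out.println(row.getString("user_id") + " -> "
                        + row.getSet("emails", String.class));
            }
        } finally {
            cluster.close(); // also closes any sessions opened from this cluster
        }
    }
}
```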
Confidential
Roles and Responsibilities:
- Designed, developed and deployed application components using Java Collections, providing concurrent database access through multithreading.
- Used Spring/Hibernate frameworks.
- Performance tuned the application to prevent memory leaks and to boost its performance and reliability.
- Developed programs to notify the operational team of client response and connection times on the network (a simplified sketch appears at the end of this role).
- Created use cases and tested queries before shipping the final production code for validation to support and maintenance team.
- Configured the development Hadoop cluster in local (standalone), pseudo-distributed and fully distributed modes.
- Reviewed the HDFS usage and system design for future scalability and fault tolerance.
- Built scalable distributed data solutions using Hadoop.
- Transferred data between MySQL and HDFS using Sqoop on a regular basis.
- Used Hive queries to aggregate data and mine information, sorted by volume and grouped by vendor and product.
- Developed custom MapReduce code, generated JAR files for user-defined functions and integrated them with Hive to make statistical procedures accessible to the entire analysis team (a sketch of such a UDF appears at the end of this role).
- Performed statistical data analysis routines using Java APIs to analyze data.
- Attended weekly meetings with technical collaborators and actively participated in code review sessions with senior and junior developers.
Environment: Apache Hadoop, Hive, Core Java, Ubuntu
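As a simplified sketch of the response-time notification program mentioned in this role (the hosts, port, thread-pool size and alert threshold are hypothetical), client connection times could be probed concurrently and slow endpoints flagged roughly as follows.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Minimal sketch: probe client endpoints in parallel, record connection times in a
// thread-safe map, and report the hosts that breach a (hypothetical) threshold.
public class ResponseTimeMonitor {

    private static final int TIMEOUT_MS = 2000;
    private static final long ALERT_THRESHOLD_MS = 500;

    public static void main(String[] args) throws InterruptedException {
        List<String> hosts = Arrays.asList(
                "app01.example.com", "app02.example.com", "db01.example.com");
        final Map<String, Long> connectMillis = new ConcurrentHashMap<String, Long>();

        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (final String host : hosts) {
            pool.submit(new Runnable() {
                public void run() {
                    long start = System.currentTimeMillis();
                    Socket socket = new Socket();
                    try {
                        socket.connect(new InetSocketAddress(host, 8080), TIMEOUT_MS);
                        connectMillis.put(host, System.currentTimeMillis() - start);
                    } catch (IOException e) {
                        connectMillis.put(host, -1L); // unreachable
                    } finally {
                        try { socket.close(); } catch (IOException ignored) { }
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);

        // In the real system this report would be sent to the operations team;
        // here it is simply printed.
        for (Map.Entry<String, Long> entry : connectMillis.entrySet()) {
            long ms = entry.getValue();
            if (ms < 0 || ms > ALERT_THRESHOLD_MS) {
                System.out.println("ALERT " + entry.getKey() + " connect time: " + ms + " ms");
            }
        }
    }
}
```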
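And as a minimal sketch of the Hive UDF/JAR integration described above (the function name and its normalization rule are illustrative, not the project's actual UDFs), a classic Hive UDF in Java looks roughly like this.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Minimal sketch of a classic Hive UDF, packaged into a JAR and registered with
// ADD JAR / CREATE TEMPORARY FUNCTION. The normalization rule is illustrative only.
public final class NormalizeVendor extends UDF {

    // Hive resolves evaluate() by reflection for the classic UDF API.
    public Text evaluate(Text vendor) {
        if (vendor == null) {
            return null;
        }
        // Trim whitespace and lower-case so grouping by vendor is consistent.
        return new Text(vendor.toString().trim().toLowerCase());
    }
}
```

Such a class would be built into a JAR, registered from the Hive session with ADD JAR and CREATE TEMPORARY FUNCTION, and then used inside GROUP BY queries such as the vendor and product aggregations above.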