
Java/Scala/Big Data Developer Resume


Irving, Texas

PROFESSIONAL SUMMARY:

  • Over 8 years of information technology experience in core Java, server-side J2EE development, and data warehousing/data staging/ETL design and development, covering testing and deployment of software systems from development through production, with an emphasis on the object-oriented paradigm.
  • More than 4 years of experience with big data analysis and batch processing tools in the Apache Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce, and Sqoop.
  • Excellent knowledge of Hadoop architecture (Hadoop 1 and 2), including YARN and ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase, Impala, and custom MapReduce programs in Java.
  • Experienced with NoSQL databases like HBase, Cassandra, and MongoDB, with hands-on experience writing applications on HBase and Cassandra.
  • Experienced in using the Tableau Server ODBC driver to connect to back-end data sources like Hive Server, MySQL, and Impala for extracting data and creating reports and dashboards.
  • Experience in developing microservices with Spring Boot using Java and with the Akka framework using Scala.
  • Experienced with real-time streaming platforms like Apache Flume, Apache Kafka, Apache Spark (Streaming, batch, and SQL), and Apache Cassandra for Internet of Things (IoT) use cases; see the sketch after this list.
  • Experienced with container-based tools like Docker in combination with Puppet and Jenkins.
  • Knowledge of manipulating/analyzing large datasets and finding patterns and insights within structured and unstructured data.
  • Good knowledge of working with and analyzing healthcare claim (837P/I) transaction data (procedure codes, diagnosis codes, NPI, taxonomy, provider specialty).
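
As a concrete illustration of the Kafka/Spark/Cassandra streaming experience above, the following is a minimal Scala sketch of a DStream-based pipeline. The topic, keyspace, table, and record format are hypothetical, and it assumes the spark-streaming-kafka-0-10 and spark-cassandra-connector libraries are on the classpath.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import com.datastax.spark.connector.SomeColumns
import com.datastax.spark.connector.streaming._

object IotPipeline {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("iot-pipeline")
      .set("spark.cassandra.connection.host", "127.0.0.1") // assumed Cassandra host

    val ssc = new StreamingContext(conf, Seconds(10)) // 10-second micro-batches

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092", // assumed broker
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "iot-consumers",
      "auto.offset.reset" -> "latest"
    )

    // Hypothetical topic "iot-events": each record is "deviceId,temperature"
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("iot-events"), kafkaParams))

    // Parse each record and append it to a hypothetical iot.readings table
    stream.map(_.value.split(","))
      .map(fields => (fields(0), fields(1).toDouble))
      .saveToCassandra("iot", "readings", SomeColumns("device_id", "temperature"))

    ssc.start()
    ssc.awaitTermination()
  }
}
```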

TECHNICAL SKILL SET:

Hadoop Ecosystem: Apache Hadoop (HDFS/MapReduce), YARN, Pig, Hive, HBase, Sqoop, Flume, Apache Spark

Advanced Big Data Technologies: DataStax Cassandra Enterprise 4.6, Cloudera CDH4, HDP 2.0, MapR 4.0.1

Programming Languages: Java 1.8, Scala 2.11.8, Groovy, SQL, R (statistics)

Statistics/Machine Learning: Linear and multivariate regression models, PCA

RDBMS (SQL): MySQL, MS SQL Server, SQLite, PostgreSQL, Oracle 12c

NoSQL/Time Series Databases: Cassandra, HBase, MongoDB, InfluxDB

Java Technologies/Frameworks: JDBC, multithreading, JSP, XML

Web/Application Servers: Apache Tomcat, Red Hat JBoss (Drools and BRMS)

Operating Systems: Windows, Unix (OS X), and Linux (Ubuntu, CentOS, Debian)

IDEs and Software: Eclipse Luna, NetBeans, RStudio, IntelliJ IDEA, Minitab

Reporting/BI/Visualization/Monitoring Tools: Tableau, MicroStrategy, Grafana

Workflow Tools: Atlassian Jira, Confluence, ServiceNow

Version Control: GitHub, Stash

Scripting Environments: Linux/Unix shell, Bash

Application Build Tools: Apache Maven, Ant, Scala Build Tool (SBT), Atlassian Maven Plugin Suite (AMPS), npm, Yarn

Verticals: Healthcare, telecom networking, cable network

PROFESSIONAL EXPERIENCE:

Confidential, Irving, Texas

Java/Scala/Big Data Developer

Responsibilities:

  • Architected and developed a web application implementing MVC architecture, integrating JPA/Hibernate ORM for CRUD operations, Hibernate Search, and the Spring 4.x framework
  • Developed microservices using Spring Boot and core Java/J2EE, hosted on AWS and called by the Confidential Fios mobile app
  • Worked with the Tier 3 support team to troubleshoot Confidential Fios and IPTV customer issues (provisioning STB devices, Wi-Fi connectivity, in-home and out-of-home connection problems, STB reboots, ordering and service assurance) by developing APIs and automated flows
  • Developed Splunk dashboards and reports based on metrics and KPIs collected through custom application logging and the Splunk REST API
  • Implemented token-based server-to-server (S2S) authentication in Java to access remote REST APIs
  • Developed microservices as RESTful web services using Akka Actors and the Akka HTTP framework in Scala to handle high concurrency and high traffic volumes (see the sketch after this list)
  • Developed REST-based Scala services to pull data from an Elasticsearch/Lucene dashboard, Splunk, and Atlassian Jira
  • Implemented a real-time data analytics pipeline using Kafka and Spark Streaming as part of an enterprise messaging pattern
  • Actively involved with senior team members in modeling data for persistence into different back-end databases
  • Developed a Spark SQL job as part of an ETL project to aggregate reporting data and ingest it into Hive and HBase for data warehousing and reporting
  • Developed a Spark SQL job as part of an ETL project to aggregate JSON input data and write it to a Cassandra database for reporting
  • Executed a POC for generating Apache Solr indexes with Spark batch jobs, to be stored in Cassandra
  • Developed RESTful web services that invoke Spark batch jobs through Spark's hidden REST API
  • Developed a prototype for monitoring real-time application metrics/KPIs with InfluxDB and Grafana, fed through Kafka and Spark
  • Deployed Spark jobs in standalone, YARN, and Mesos modes
  • Developed and implemented stored procedures and other database queries using the jTDS JDBC 3.0 driver for SQL Server 2012
  • Developed a native Scala/Java library using JSch to remotely execute Auto Logs Perl scripts
  • Created and implemented a custom grid system using the CSS grid system and the jQuery JavaScript library
  • Developed complex Jira automation workflows, including project workflows, screen schemes, permission schemes, and notification schemes, using the Atlassian Jira Plugin API (core Java) and Adaptavist ScriptRunner Groovy scripts, triggering the Jira Event Listener API
  • Worked with a Jenkins/Puppet-based CI process to promote code in Dockerized containers on AWS instances
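
A minimal sketch of the Akka HTTP style of service described above, assuming Akka HTTP 10.0.x on Scala 2.11; the route, port, and JSON payload are hypothetical.

```scala
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.server.Directives._
import akka.stream.ActorMaterializer

object StatusService {
  def main(args: Array[String]): Unit = {
    implicit val system = ActorSystem("status-service")
    implicit val materializer = ActorMaterializer() // required by Akka HTTP 10.0.x

    // Hypothetical route: GET /status/<deviceId> returns a small JSON document
    val route =
      path("status" / Segment) { deviceId =>
        get {
          complete(s"""{"device":"$deviceId","state":"OK"}""")
        }
      }

    Http().bindAndHandle(route, "0.0.0.0", 8080)
    println("Listening on http://0.0.0.0:8080 ...")
  }
}
```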

Environment: Java 1.8, Groovy 2.4.1, Apache Spark 2.1, Apache HBase, Apache Hive, Scala 2.11.8, Spring MVC 4.x, Hibernate 4.x, Akka, Akka HTTP, Atlassian Jira 7.2.6, Splunk 6.5.2, Microsoft SQL Server 2012, SBT, Apache Maven, AMPS, Apache Ant, IntelliJ IDEA, Eclipse Neon, Apache Tomcat 8

Confidential, Richardson, Texas

Hadoop /Big Data Developer

Responsibilities:

  • Worked on a MapR 4.1 Hadoop YARN cluster across development/pre-prod and prod clusters of 50 nodes (16 cores and 256 GB RAM each, with 1 TB of storage per node)
  • Worked in a team of 12 in an onshore/offshore SDLC model involving business analysts and developers
  • Involved in creating dynamically partitioned Hive tables, internal/external tables, and views for reporting and business intelligence by extracting data from EDW tables
  • Extended Hive and Pig core functionality by writing custom UDFs (see the sketch after this list)
  • Responsible for developing simple and complex analysis jobs using HiveQL, Pig Latin, and Impala
  • Optimized Hive joins using map-side joins, cost-based optimization, and filters
  • Exported the analyzed data into relational databases using Sqoop and Hive for visualization and to generate reports for the BI team in Tableau
  • Used Tableau to connect to the Hive tables through the ODBC driver and developed dashboards on them
  • Worked with 837 EDI transaction data along with network, provider, and member data
  • Developed Sqoop scripts to import and export data between RDBMSs and HDFS, Hive, and HBase
  • Built in-memory and streaming data POCs using Spark and Apache Drill
  • Developed automated Hive scripts to update Hive external tables that generate Solr indexes
  • Developed Lucene-based search queries to search data from the Solr-Hive tables
  • Automated various tasks using Bash/shell Linux scripting
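
As an example of the custom Hive UDFs mentioned above, here is a minimal sketch using Hive's classic UDF API from Scala; the class name and the EDI-style prefix it strips are hypothetical.

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Hypothetical UDF: strips a two-letter qualifier prefix such as "XX:" from a code.
// Hive discovers the evaluate() method by reflection.
class StripQualifierUDF extends UDF {
  def evaluate(input: Text): Text = {
    if (input == null) null
    else new Text(input.toString.replaceFirst("^[A-Z]{2}:", ""))
  }
}
```

After packaging the class into a JAR, it would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION strip_qualifier AS 'StripQualifierUDF'.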

Environment: MapR 4.1, Hive 0.13, Eclipse, MobaXterm, Pig, HBase, Sqoop, Tableau 8.2

Confidential, Atlanta, GA

Big Data/Spark/Cassandra Developer

Responsibilities:
  • Worked as a Cassandra/Spark engineer in development/pre-production and production environments on a Cassandra cluster of 11 nodes (16-core processors and 64 GB RAM, with 1 TB SSD and a 1 Gbps NIC per node) and 5 Red Hat JBoss application servers (Drools, Fuse, and BRMS)
  • Worked with the solution architect on designing the architecture and documenting the enterprise-standard High-Level Design and Low-Level Design documents, along with gathering requirements
  • Operated, maintained, configured, and monitored Cassandra using DataStax OpsCenter, the JMX utility (JConsole), and various Linux utilities
  • Carried out performance testing/benchmarking of the 11-node cluster to evaluate the application and cluster using out-of-the-box tools such as YCSB and cassandra-stress
  • Involved in developing DAOs using the DataStax core Java driver and RESTful web services, and in designing and developing the data model for Cassandra keyspaces and tables using CQL 3 (see the sketch after this list)
  • Carried out benchmarking to simulate real-world data ingestion patterns and analyzed read/write/insert/update latency, CPU/memory usage, and IOPS
  • Involved in performance tuning, configuration, and optimization of the Cassandra cluster by adjusting parameters for read operations, compaction, memory cache, and row cache
  • Designed the backup, failure, and recovery plan for recovering data, creating backups of the entire cluster
  • Developed Lucene-based indexes on top of HDFS data by writing MapReduce jobs
  • Worked with the offshore team to integrate the GUI adapter for searching data using Solr queries
  • Configured and integrated Jenkins with Puppet and Chef to trigger automatic, continuous builds and execute JUnit tests in the production environment
  • Coordinated with the offshore/onshore team and arranged weekly meetings to discuss and track development progress
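
A minimal sketch of a DAO-style call path with the DataStax core Java driver (the 2.x line that shipped with DSE 4.6), written in Scala; the contact point, keyspace, table, and columns are hypothetical.

```scala
import com.datastax.driver.core.Cluster

object ProviderDao {
  def main(args: Array[String]): Unit = {
    // Hypothetical contact point and keyspace
    val cluster = Cluster.builder().addContactPoint("127.0.0.1").build()
    val session = cluster.connect("claims")

    // Prepared statement for a hypothetical providers table
    val insert = session.prepare(
      "INSERT INTO providers (npi, specialty) VALUES (?, ?)")
    session.execute(insert.bind("1234567890", "cardiology"))

    // Read the row back
    val row = session.execute(
      "SELECT specialty FROM providers WHERE npi = '1234567890'").one()
    println(row.getString("specialty"))

    cluster.close()
  }
}
```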

Environment: RHEL 6, DSE 4.6, DataStax OpsCenter 5.1, Maven 3, Jenkins, Puppet, Cassandra 2.1.1, Eclipse, PuTTY, J2EE

Confidential, NDSU, Fargo, ND

Hadoop/Java Developer/Big Data Analyst

Responsibilities:
  • Migrated data from RDBMS (MySQL) to HDFS on a regular basis using Sqoop from various sources
  • Implemented Hive queries to aggregate the data and extract useful information by sorting the data on the required attributes
  • Developed a back-end REST API for CRUD operations using the Apache HBase client library (see the sketch after this list)
  • Worked on implementing partitions, dynamic partitions, and buckets in Hive for efficient data access
  • Used R libraries and hive-json-serde to analyze Twitter feeds in their native JSON format
  • Created Flume agents to ingest Twitter feeds into HDFS
  • Created Hive tables with dynamic partitions and buckets, loaded data into them, and wrote Hive queries and scripts
  • Developed statistical prediction models in R using linear and non-linear regression, variable selection methods, and multivariate regression techniques
  • Analyzed large datasets with mixed data types, including structured and unstructured data
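
A minimal sketch of the HBase client usage behind such a CRUD API, written in Scala against the HBase 1.x client API (the CDH 4 era HTable API differs slightly); the table, column family, and row key are hypothetical.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get, Put}
import org.apache.hadoop.hbase.util.Bytes

object TweetStore {
  def main(args: Array[String]): Unit = {
    val conn = ConnectionFactory.createConnection(HBaseConfiguration.create())
    val table = conn.getTable(TableName.valueOf("tweets")) // hypothetical table

    // Create/update: one row keyed by tweet id, column family "d"
    val put = new Put(Bytes.toBytes("tweet-1001"))
    put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("text"), Bytes.toBytes("hello"))
    table.put(put)

    // Read the cell back
    val result = table.get(new Get(Bytes.toBytes("tweet-1001")))
    println(Bytes.toString(result.getValue(Bytes.toBytes("d"), Bytes.toBytes("text"))))

    table.close()
    conn.close()
  }
}
```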

Environment: Hadoop (HDFS/MapReduce), Sqoop, Pig, UDF, HBase, DataStax Cassandra, CDH 4.2, Twitter API, Hive, HQL, R, RStudio, Linux, X2Go Client, WinSCP, PuTTY

Confidential

Software Developer

Responsibilities:
  • Involved in the complete software development life cycle (SDLC) of the multi-tier application, from requirement analysis and review through testing, using object-oriented design and development methodology
  • Implemented the application using the Spring MVC framework, JSP and Servlets, HTML, JavaScript, and CSS
  • Used joins, triggers, stored procedures, and functions to interact with the back-end database through SQL and JDBC drivers (see the sketch after this list)
  • Involved in setting up the development and review environment, which included installing and configuring Apache Tomcat on a Debian Linux server
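
A minimal sketch of calling a stored procedure over JDBC in Scala; the connection URL, credentials, and procedure are hypothetical (the environment below suggests SQL Server/T-SQL, for which a jTDS-style URL would apply).

```scala
import java.sql.{DriverManager, Types}

object ProcCall {
  def main(args: Array[String]): Unit = {
    // Hypothetical jTDS-style SQL Server URL and credentials
    val conn = DriverManager.getConnection(
      "jdbc:jtds:sqlserver://localhost:1433/app", "user", "secret")

    // Hypothetical procedure: IN order id, OUT order total
    val stmt = conn.prepareCall("{call get_order_total(?, ?)}")
    stmt.setInt(1, 42)
    stmt.registerOutParameter(2, Types.DECIMAL)
    stmt.execute()
    println(stmt.getBigDecimal(2))

    stmt.close()
    conn.close()
  }
}
```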

Environment: Java, JavaScript, HTML, Apache Tomcat, T-SQL, Debian Linux, UML, JUnit
