Hadoop Developer Resume
Piscataway, New Jersey
PROFESSIONAL SUMMARY:
- Around 8 years of experience in big data technologies and data analytics using Java and Python.
- Strong hands-on experience with programming languages including Python, R, Java, Scala, and SAS.
- Experience researching and implementing NoSQL databases to meet ongoing project requirements.
- Hands-on experience building efficient MapReduce programs on Apache Hadoop to run jobs and analyze big data.
- Extensive hands-on experience with Sqoop to move data between Microsoft SQL Server and HDFS (a sample invocation follows this summary).
- Good experience with big data development tools such as HBase, Pig, Hive, Sqoop, Storm, Oozie, and Flume.
- Experience as a Data Engineer performing data extraction, transformation, cleaning, and loading.
- Strong experience in data visualization with tools such as Tableau, Plotly, and QlikView.
- Experience visualizing data in Python, R, and SAS with libraries such as Matplotlib.
- Good knowledge and experience with Machine Learning Algorithms.
- Experience writing shell scripts to automate MapReduce jobs.
- Experience building Hadoop and other big data environments and maintaining clusters.
- Good knowledge of NoSQL databases, including performing various transactions.
- Built custom ETL pipelines to extract, transform, and load data from MS SQL Server to HDFS.
- Experience in building applications using Python and Java.
- Good knowledge of artificial intelligence applications, including hands-on image recognition work.
- Experience building and understanding natural language processing programs.
- Practical experience and knowledge in data mining and text analytics using Python and R.
- Hands-on experience developing and deploying SQL-backed applications across programming languages.
- Experience in Apache Spark with Scala, Python and Java.
- Participated in Fraud Detection, Anti-Money Laundering (AML), and Know Your Customer (KYC) projects.
- Experience with ETL and querying in big data languages such as Pig Latin and HiveQL.
- Knowledge of big data machine learning toolkits such as Mahout and Spark ML.
- Hands-on experience handling data formats such as JSON and XML.
- Good knowledge of and experience with RESTful APIs.
- Hands-on experience with the MarkLogic NoSQL database, including formal coursework on it.
- Experience writing data extraction programs in Python for web crawling and scraping.
- Knowledge of various NoSQL databases and of performing business analysis with visualization tools.
- Experience with software development environments such as Eclipse and NetBeans.
- Strong problem-solving and code-optimization skills for producing efficient output.
- Broad programming knowledge from experience and research across many languages, enabling quick adaptation to new ones.
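
As an illustration of the Sqoop transfers noted above, here is a minimal sketch of an import from SQL Server into HDFS, wrapped in Python; the host, database, credentials, table, and target directory are placeholders, not details from any engagement below:

    # Launch a Sqoop import from Microsoft SQL Server into HDFS.
    # All connection details below are placeholders (assumptions).
    import subprocess

    subprocess.run([
        "sqoop", "import",
        "--connect", "jdbc:sqlserver://dbhost:1433;databaseName=sales",
        "--username", "etl_user",
        "--password-file", "/user/etl/.sqoop-pwd",
        "--table", "transactions",
        "--target-dir", "/data/raw/transactions",
        "--num-mappers", "4",
    ], check=True)  # raises CalledProcessError if the import fails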
TECHNICAL SKILLS:
Programming: Python, Java, R, SAS, Scala
Database: Microsoft SQL Server, MySQL, Oracle
NoSQL: MongoDB, MarkLogic, Cassandra
Big Data: Hadoop, HDFS, MapReduce, Sqoop, Pig, Hive, HBase, Spark
Visualization/Analytical Tools: Tableau
Operating Systems: Linux, Windows
Scripting: Shell Scripting
Other Skills: Major machine learning algorithms, RESTful web services, XQuery, XSLT
PROFESSIONAL EXPERIENCE:
Confidential, Piscataway, New Jersey
Hadoop Developer
Responsibilities:
- Automated the daily mining of user details from transaction records.
- Built a 10-node Hadoop cluster to handle large daily datasets.
- Executed Sqoop commands to extract data from MS SQL Server into HDFS.
- Built a MapReduce program to extract user details (a simplified sketch follows this project).
- Implemented the MapReduce job in Java, using the Aho-Corasick algorithm for string search.
- Wrote a shell script to run these jobs in order each day.
- Migrated the code to Scala and tested it with Spark.
- Used Tableau to visualize user density by state.
Environment: Linux, Hadoop, MapReduce, Yarn, Apache Spark, MsSQL, Sqoop, HDFS, Java, Scala, Shell Scripting
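
For illustration, the extraction step recast as a Hadoop Streaming job in Python (the production job was written in Java with Aho-Corasick matching; the pipe-delimited input layout and field positions here are assumptions):

    #!/usr/bin/env python
    # mapper.py -- emit (user_id, state) from raw transaction lines.
    # Assumed input layout: user_id|state|amount|details
    import sys

    for line in sys.stdin:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 4:
            continue  # skip malformed records
        print("%s\t%s" % (fields[0], fields[1]))

    # reducer.py -- count transactions per user (input arrives sorted by key).
    # Run with: hadoop jar hadoop-streaming.jar -mapper mapper.py -reducer reducer.py ...
    import sys

    current, count = None, 0
    for line in sys.stdin:
        user = line.split("\t")[0]
        if user != current:
            if current is not None:
                print("%s\t%d" % (current, count))
            current, count = user, 0
        count += 1
    if current is not None:
        print("%s\t%d" % (current, count))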
Confidential, Atlanta, GA
Big Data Analyst and Developer
Responsibilities:
- Mined social media content and extracted information relevant to users' search queries.
- Performed natural language processing (NLP) on this data in Python with NLTK and CoreNLP (see the sketch after this project).
- Also used R to apply deep learning to the data and look for patterns.
- Developed an artificial intelligence application for image recognition.
- Built the image recognition model with the Microsoft Cognitive Toolkit and a convolutional neural network.
- Stored the data in MongoDB and performed various transactions on it.
Environment: Hadoop, MS SQL, Excel, Python, R, MongoDB, Microsoft Cognitive Toolkit.
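
A minimal sketch of the kind of NLTK pass used on the social media text; the sample sentence and the exact pipeline are illustrative assumptions:

    # Tokenize, POS-tag, and chunk named entities with NLTK.
    # Requires: pip install nltk, plus nltk.download() for 'punkt',
    # 'averaged_perceptron_tagger', 'maxent_ne_chunker', and 'words'.
    import nltk

    text = "Acme Corp opened a new office in Buffalo last week."  # sample input (assumption)
    tokens = nltk.word_tokenize(text)
    tagged = nltk.pos_tag(tokens)
    tree = nltk.ne_chunk(tagged)

    # Collect named entities to match against a user's search query.
    entities = [" ".join(word for word, tag in chunk.leaves())
                for chunk in tree if hasattr(chunk, "label")]
    print(entities)  # e.g. ['Acme Corp', 'Buffalo']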
Confidential, Buffalo, NY
Data Scientist
Responsibilities:
- Analyzed and evaluated KYC (Know Your Customer) documents.
- Identified patterns using text mining and text analytics.
- Used R for analytics and data mining.
- Implemented various machine learning algorithms and monitored for outliers (a simple outlier check is sketched after this project).
- Used Python to clean the data and give it structure.
- Served on the AML (Anti-Money Laundering) team, building and implementing rules.
- Transformed unstructured transaction data into SAS datasets.
Environment: Hadoop, MS Excel, MapReduce, Python, R, SAS, MarkLogic, Tableau.
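
One simple way to screen transaction amounts for outliers, sketched in Python; the z-score rule, the threshold, and the sample data are assumptions, not the team's actual AML rules:

    # Flag amounts far from the mean -- a basic z-score outlier screen.
    # Threshold and sample values are illustrative assumptions.
    import statistics

    amounts = [120.0, 95.5, 101.3, 87.9, 110.2, 25000.0]
    mean = statistics.mean(amounts)
    stdev = statistics.pstdev(amounts)

    outliers = [a for a in amounts if stdev and abs(a - mean) / stdev > 2.0]
    print(outliers)  # [25000.0] for this sample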
Confidential
Big Data Developer
Responsibilities:
- Built complex MapReduce programs in Java to run jobs over large transaction data.
- Used Hadoop to store large sets of unstructured data in HDFS.
- Imported data into HDFS from various databases using Sqoop.
- Transformed the data with Hive and MapReduce and loaded it into HDFS.
- Performed real-time data streaming with Spark and Kafka (a minimal sketch follows this project).
Environment: Hadoop, HDFS, MapReduce, Yarn, Java, Sqoop, Apache Spark, Kafka, Oozie.
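
A minimal sketch of real-time consumption from Kafka with Spark Streaming in Python; it assumes Spark 2.x's DStream Kafka integration (replaced by Structured Streaming in Spark 3), and the broker address and topic name are placeholders:

    # Count transaction events per 10-second batch from a Kafka topic.
    # Submit with the spark-streaming-kafka package on the classpath.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils

    sc = SparkContext(appName="TxnStream")
    ssc = StreamingContext(sc, 10)  # 10-second micro-batches

    stream = KafkaUtils.createDirectStream(
        ssc, ["transactions"],                    # placeholder topic
        {"metadata.broker.list": "broker:9092"})  # placeholder broker

    stream.map(lambda kv: kv[1]).count().pprint() # records per batch

    ssc.start()
    ssc.awaitTermination()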
Confidential
Data Analyst
Responsibilities:
- Created Tableau dashboards from the finance dataset.
- Embedded R scripts in Tableau to create calculated fields.
- Built a small Hadoop environment for big data performance testing.
- Wrote Pig scripts and HiveQL queries to run analytics on the data.
- Crawled and scraped data from various websites with Python and stored it in a database (a scraping sketch follows this project).
- Built visualizations in Python with Matplotlib and scikit-learn to showcase the analysis.
Environment: Windows, Linux (Ubuntu), SQL, Tableau, Hadoop, PIG, Hive, Python.
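
A hedged sketch of the scrape-and-store step in Python; the URL, CSS selector, and SQLite table are illustrative assumptions, since the actual sites and database are not named above:

    # Fetch a page, extract headline text, and store rows in a local SQLite DB.
    # Requires: pip install requests beautifulsoup4
    import sqlite3
    import requests
    from bs4 import BeautifulSoup

    url = "https://example.com/news"  # placeholder URL
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    rows = [(h.get_text(strip=True),) for h in soup.select("h2.headline")]  # assumed selector

    conn = sqlite3.connect("scraped.db")
    conn.execute("CREATE TABLE IF NOT EXISTS headlines (text TEXT)")
    conn.executemany("INSERT INTO headlines VALUES (?)", rows)
    conn.commit()
    conn.close()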