Machine Learning Resume
NY
SUMMARY:
- 8+ years of experience designing, building, and implementing analytical and enterprise applications using machine learning with Python, R, Scala, and Java.
- Good experience with a focus on Big Data, Deep Learning, Machine Learning, image processing, and AI.
- Very good hands-on experience with Spark Core, Spark SQL, Spark Streaming, and Spark machine learning using the Scala and Python programming languages.
- Very good experience implementing and maintaining end-to-end data science products.
- Good experience with periodic model validation and optimization workflows for the data science products developed.
- Good experience extracting and analyzing very large volumes of data, covering a wide range of information from user profiles to transaction history, using machine learning tools.
- Collaborated with engineers to deploy successful models and algorithms into production environments.
- Good understanding of model validation processes and optimizations.
- An excellent understanding of both traditional statistical modeling and machine learning techniques and algorithms such as regression, clustering, ensembling (random forests, gradient boosting), and deep learning (neural networks).
- Proficient in understanding and analyzing business requirements, building predictive models, designing experiments, testing hypotheses, and interpreting statistical results into actionable insights and recommendations.
- Fluent in Python with working knowledge of ML and statistical libraries (e.g., scikit-learn and Pandas).
- Experience processing real-time data and building end-to-end ML pipelines.
- Very strong in Python, statistical analysis, tools, and modeling.
- Very good hands-on experience working with large datasets and deep learning algorithms using Apache Spark and TensorFlow.
- Good knowledge of recurrent neural networks, LSTM networks, and word2vec.
- Good experience refining and improving image recognition pipelines.
- Deep interest in learning both the theoretical and practical aspects of working with and deriving insights from data.
- Developed highly scalable classifiers and tools by leveraging machine learning, Apache Spark, and deep learning.
- Worked under the direction of the CSO to develop an effective solution to a predictive analytics problem, testing a number of potential machine learning algorithms with Apache Spark.
- Built state-of-the-art statistical procedures, algorithms, and models to solve a range of problems in diverse domains.
- Proficient code-writing capability in major programming languages such as Python, R, Java, and Scala.
- Good experience with deep learning frameworks like Caffe and TensorFlow.
- Experience using deep learning to solve problems in image and video analysis.
- Good understanding of Apache Spark's features and advantages over MapReduce and traditional systems.
- Solid understanding of RDD operations in Apache Spark, i.e., transformations and actions, persistence (caching), accumulators, and broadcast variables.
- In-depth understanding of Apache Spark job execution components such as the DAG, lineage graph, DAG scheduler, task scheduler, stages, and tasks.
- Highly organized and detail-oriented, with a strong ability to coordinate and track multiple deliverables, tasks, and dependencies.
- Experience in exposing Apache Spark as web services.
- Experience in real-time processing using Apache Spark and Kafka.
- Good working experience with NoSQL databases such as Cassandra and MongoDB.
- Delivered multiple end-to-end Big Data analytics solutions on distributed systems such as Apache Spark.
- Experience leveraging DevOps techniques and practices such as Continuous Integration, Continuous Deployment, Test Automation, and Build Automation.
- Hands-on experience leading delivery through Agile methodologies.
- Experience managing code on GitHub.
- Good hands-on experience with the Spring and Hibernate frameworks.
- Solid understanding of object-oriented programming.
- Familiarity with the concepts of MVC, JDBC, and RESTful services.
- Familiarity with build tools such as Maven and SBT.
- Knowledge of information extraction and NLP algorithms coupled with deep learning.
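As a concrete illustration of the clustering techniques listed above, here is a minimal pure-Python k-means sketch. It is illustrative only; in practice the work described here would use scikit-learn's KMeans or Spark MLlib rather than a hand-rolled loop.

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means: assign each point to its nearest centroid,
    then recompute each centroid as the mean of its cluster."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Recompute centroids; keep the old one if a cluster went empty.
        centroids = [
            tuple(sum(x) / len(cl) for x in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters
```

On well-separated data this converges in a few iterations; the `seed` parameter only controls the initial centroid sample.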
TECHNICAL SKILLS:
Languages: Python, R, Scala, and Java
Machine Learning Libraries: Spark ML, Spark MLlib, scikit-learn, NLTK & Stanford NLP
Deep Learning Framework: TensorFlow
Big Data Frameworks: Apache Spark, Apache Hadoop, Kafka, MongoDB, and Cassandra
Machine Learning: Linear Regression, Logistic Regression, Naive Bayes, SVM, Decision Trees, Random Forest, Boosting, K-means, Bagging, etc.
Big Data Distributions: Cloudera & Amazon EMR
Web Technologies: Flask, Django, and Spring MVC
Front-End Technologies: JSP, HTML5, Ajax, jQuery, and XML
Web Servers: Apache2, Nginx, WebSphere, and Tomcat
Visualization Tools: Apache Zeppelin, Matplotlib, and Tableau
Databases: Oracle, MySQL, and PostgreSQL
NoSQL: MongoDB and Cassandra
Operating Systems: Linux and Windows
Scheduling Tools: Airflow & Oozie
PROFESSIONAL EXPERIENCE:
Confidential
Machine Learning
Responsibilities:
- Converted data from PDF to XML using Python scripts in two stages: from raw XML to processed XML, and from processed XML to CSV files.
- Developed a generic script for processing the regulatory documents.
- Used Python's ElementTree (ET) to parse the XML derived from the PDF files.
- Accessed data stored in sqlite3 data files (DBs) using Python, extracted the metadata and tables, and converted the tables to corresponding CSV files.
- Used XML tags and attributes to map headings, side-headings, and subheadings to rows in the CSV file.
- Used text mining and NLP techniques to find the sentiment about the organization.
- Deployed a spam detection model and performed sentiment analysis of customer product reviews using NLP techniques.
- Developed and implemented predictive models of user behavior data on websites, URL categorization, social network analysis, social mining, and search content based on large-scale machine learning.
- Developed predictive models on large-scale datasets to address various business problems through leveraging advanced statistical modeling, machine learning,and deep learning.
- Extensively used Pandas, NumPy, Seaborn, Matplotlib, scikit-learn, SciPy, and NLTK in Python for developing various machine learning algorithms.
- Used the R programming language for graphically critiquing the datasets and gaining insights into the nature of the data.
- Researched deep learning approaches to implementing NLP.
- Applied clustering, NLP, and neural networks; visualized and presented the results using interactive dashboards.
- Involved in the transfer of files from GitHub to DSX.
- Involved in the execution of CSV files in Data Science Experience.
- A major part of the project was importing the converted CSV files into the Confidential internal API, the InfoSphere Information Governance Catalog.
- Used Beautiful Soup for web scraping (parsing the data).
- Developed the code to capture the descriptions under the headings of the index section into the description column of each CSV row.
- Used other Python libraries such as PDFMiner, PyPDF2, PDFQuery, and sqlite3.
- Converted Unicode to the nearest possible ASCII string using the Unidecode module.
- Added a column to each CSV row giving the parent index number of that row.
Environment: R Studio, AWS S3, NLP, EC2, Neural networks, SVM, Decision trees, MLbase, ad-hoc, Mahout, NoSQL, PL/SQL, MDM, MLlib & Git.
Confidential, NY
Data Scientist
Responsibilities:
- Performed exploratory data analysis, data visualization, and feature selection using Python and Apache Spark.
- Scaled scikit-learn machine learning algorithms using Apache Spark.
- Used techniques such as Fast Fourier Transforms, Convolutional Neural Networks, and deep learning.
- Developed deep convolutional and recurrent neural networks with TensorFlow, with significant Risk Management and Quantitative Finance experience.
- Used multiple machine learning algorithms, including random forests, boosted trees, SVM, SGD, neural networks, and deep learning using TensorFlow.
- Used Python, Convolutional Neural Networks (CNN), Deep Belief Networks (DBN), Theano, Caffe, etc.
- Applied unsupervised and supervised learning methods in analyzing high-dimensional data. Proficient use of Python Scikit-learn, pandas, and NumPy packages.
- Performed data modeling operations using Power BI, Pandas, and SQL.
- Utilized Python libraries including wxPython, NumPy, Twisted, Matplotlib, and Beautiful Soup.
- Developed and implemented predictive models of user behavior data on websites, URL categorization, social network analysis, social mining, and search content based on large-scale machine learning.
- Wrote scripts in Python using Apache Spark and ElasticSearch engine for use in creating dashboards visualized in Grafana.
- Led development of Natural Language Processing (NLP) initiatives with chatbots and virtual assistants.
- Converted Pandas DataFrame datasets to Apache Spark DataFrames.
- Collaborated with engineers to deploy successful models and algorithms into production environments.
- Collaborated with a diverse team including statisticians, the Chief Science Officer, and engineers to build data science project pipelines and algorithms to derive valuable insights from current and new datasets.
- Used PySpark DataFrames to read text, CSV, and image data from HDFS, S3, and Hive.
- Cleaned input text data using the PySpark ML feature extraction API.
- Created features to train algorithms.
- Used various algorithms from the PySpark ML API.
- Trained model using historical data stored in HDFS and Amazon S3.
- Used Spark streaming to load the trained model to predict real-time data from Kafka.
- Stored the result in MongoDB.
- Utilized various new supervised and unsupervised machine learning algorithms/software to perform NLP tasks and compare performance.
- Built a web application that reads data stored in MongoDB.
- Used Apache Zeppelin for visualization of Big Data.
- Fully automated job scheduling, monitoring, and cluster management without human intervention using Airflow.
- Exposed Apache Spark as a web service using Flask; worked with input file formats such as ORC, Parquet, JSON, and Avro.
- Developed highly scalable classifiers and tools by leveraging machine learning, Apache Spark, and deep learning.
- Wrote Spark SQL UDFs and Hive UDFs.
- Optimized Spark code using Apache Spark performance tuning.
- Optimized machine learning algorithms based on need.
- Used Amazon Elastic MapReduce (EMR) to process huge numbers of datasets using Apache Spark and TensorFlow.
Environment: Machine learning, scikit-learn, Pandas, Spark Core, Spark SQL, Spark Streaming, Python, Airflow, Amazon EMR, EC2, S3, NumPy, Matplotlib, TensorFlow, Kafka, Flask, MongoDB, Hive, HDFS, GitHub & REST.
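The text-cleaning step in this role used PySpark's ML feature-extraction API (e.g., Tokenizer and StopWordsRemover). A pure-Python analogue of the same idea, with an illustrative stop-word subset rather than a real stop-word list, looks like this:

```python
import re

# Illustrative subset only; real pipelines use a full stop-word list.
STOP_WORDS = {"a", "an", "the", "is", "are", "and", "or", "to", "of"}

def clean_text(doc):
    """Lowercase the document, tokenize on alphanumeric runs,
    and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", doc.lower())
    return [t for t in tokens if t not in STOP_WORDS]
```

The resulting token lists are what downstream feature builders (hashing, TF-IDF, word2vec) consume.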
Confidential - Columbus, OH
Data Scientist
Responsibilities:
- Collaborated with internal stakeholders to understand business challenges and develop analytical solutions to optimize business processes.
- Performed analysis using industry-leading text mining, data mining, and analytical tools and open-source software.
- Used MATLAB, C/C++ with OpenCV and SVM, Neural Networks, Random Forest as classifiers.
- Generated graphical reports using the Python packages NumPy and Matplotlib.
- Built various graphs for business decision-making using the Python Matplotlib library.
- Knowledge of information extraction and NLP algorithms coupled with deep learning (ANN and CNN), Theano, Keras, and TensorFlow.
- Built and trained a deep learning network using TensorFlow on the data and reduced wafer scrap by 15% by predicting the likelihood of wafer damage; a combination of z-plot features, image features (pigmentation), and probe features was used.
- Experienced in Artificial Neural Network (ANN) and deep learning models using the Theano, TensorFlow, and Keras packages in Python.
- Used Natural Language Processing (NLP) to pre-process the data, determine the number of words and topics in the emails, and form clusters of words.
- Cleaned input text data using the PySpark ML feature extraction API.
- Used Pandas data frame for exploratory data analysis on sample dataset.
- Wrote scikit-learn-based machine learning algorithms for building POCs on sample datasets.
- Analyzed structured, semi-structured, and unstructured datasets using MapReduce and Apache Spark.
- Implemented an end-to-end lambda architecture to analyze streaming and batch datasets.
- Used Apache Mahout's scalable machine learning algorithms to build a recommendation engine and classification and regression models.
- Converted Mahout's machine learning algorithms to RDD-based Apache Spark MLlib to improve performance.
- Optimized machine learning algorithms based on need.
- Built automatic music/news/POI recommendation inside the vehicle using GPS location, passenger conversation, behavior, and mood, with machine learning and natural language processing.
- Built a smart state-of-charge monitor for electric vehicles based on a Recurrent Neural Network and Seq2Seq forecasting.
- Built multiple machine learning features using Python, Scala, and Java based on need.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Migrated single-machine machine learning algorithms to parallel processing algorithms.
- Developed Hive queries for ad-hoc analysis.
- Used Amazon Elastic MapReduce (EMR) to process huge numbers of datasets using Apache Spark and TensorFlow.
- Served as lead Data Scientist for development of machine learning and NLP engines utilizing population health data.
- Analyzed the partitioned and bucketed data and computed various metrics for reporting.
- Involved in loading data from RDBMS and weblogs into HDFS using Sqoop and Flume.
- Involved in building complex streaming data Pipeline using Kafka and Apache Spark.
- Worked on loading the data from MySQL to HBase where necessary using Sqoop.
- Exported the result set from Hive to MySQL using Sqoop after processing the data.
- Optimized Hive queries.
- Optimized MapReduce and Apache Spark jobs.
- Wrote custom input formats in MapReduce to analyze image datasets.
- Wrote Hive UDFs based on need.
Environment: Hadoop, MapReduce, Hive, Mahout, Apache Spark, Python, scikit-learn, Pandas, NumPy, Java, Maven, Eclipse, MySQL, Kafka, Sqoop & Flume.
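The email word-clustering step in this role rests on comparing documents as bag-of-words vectors. A minimal sketch of that comparison (the helper names `bow` and `cosine` are hypothetical, chosen for illustration):

```python
import math
from collections import Counter

def bow(tokens):
    """Bag-of-words representation: term -> count."""
    return Counter(tokens)

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

Pairwise similarities like these feed directly into clustering (e.g., k-means over the vectors), grouping emails that share vocabulary.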
Confidential, NY
Data Scientist
Responsibilities:
- Responsible for performing Machine-learning techniques regression/classification to predict the outcomes.
- Responsible for the design and development of advanced R/Python programs to prepare, transform, and harmonize data sets in preparation for modeling.
- Designed and automated the process of score cuts that achieve increased close and good rates using advanced R programming.
- Utilized Convolutional Neural Networks to implement a machine learning image recognition component.
- Managed datasets using Pandas DataFrames and MySQL; queried the MySQL relational database (RDBMS) from Python using the MySQLdb connector package to retrieve information.
- Utilized standard Python modules such as csv, itertools, and pickle for development.
- Tech stack is Python 2.7/PyCharm/Anaconda/pandas/NumPy/unittest/R/Oracle.
- Developed large data sets from structured and unstructured data; performed data mining.
- Partnered with modelers to develop data frame requirements for projects.
- Performed Ad-hoc reporting/customer profiling, segmentation using R/Python.
- Tracked various campaigns, generating customer profiling analysis and data manipulation.
- Provided Python programming, with detailed direction, in the execution of data analysis that contributed to the final project deliverables; responsible for data mining.
- Analyzed large datasets to answer business questions by generating reports and outcome.
- Worked with a team of programmers and data analysts to develop insightful deliverables that support data-driven marketing strategies.
- Executed SQL queries from R/Python on complex table configurations.
- Retrieved data from the database through SQL as per business requirements.
- Created, maintained, modified, and optimized SQL Server databases.
- Manipulated data using Python programming.
- Adhered to best practices for project support and documentation.
- Understood the business problem, built hypotheses, and validated them using the data.
- Managed the reporting/dashboarding for the key metrics of the business.
- Involved in data analysis using different analytic techniques and modeling techniques.
Environment: Python, Oracle, scikit-learn, Pandas, NumPy, SciPy, NLTK, Jupyter Notebook, R, and RStudio
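The csv/itertools-based customer segmentation described in this role could look roughly like the following; the column names and sample data are hypothetical, chosen only to show the shape of the approach.

```python
import csv
import io
from itertools import groupby

def segment_by(rows, key):
    """Group rows (dicts) into segments by a key column.
    groupby requires sorted input, so sort by the key first."""
    rows = sorted(rows, key=lambda r: r[key])
    return {seg: list(grp) for seg, grp in groupby(rows, key=lambda r: r[key])}

# Hypothetical sample: read a small CSV and segment customers by region.
SAMPLE = "customer,region\nalice,east\nbob,west\ncarol,east\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
segments = segment_by(rows, "region")
```

The same pattern extends to profiling: once rows are grouped, per-segment aggregates (counts, sums, rates) fall out of a loop over `segments`.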
Confidential
Data Analyst/Data Modeler
Responsibilities:
- Developed end to end enterprise Applications using Spring MVC, REST and JDBC Template Modules.
- Wrote well-designed, testable, efficient Java code.
- Understanding and analyzing complex issues and addressing challenges arising during the software development process, both conceptually and technically.
- Implemented best practices of Automated Build, Test and Deployment.
- Developed design patterns, data structures,and algorithms based on project need.
- Worked on multiple tools such as Toad, Eclipse, SVN, Apache, and Tomcat.
- Deployed models via APIs into applications or workflows
- Worked on User Interface technologies like HTML5, CSS/SCSS.
- Wrote stored procedures and SQL queries based on project need.
- Deployed built JARs to the application server.
- Created Automated Unit Tests using Flexible/Open Source Frameworks
- Developed multi-threaded and transaction-handling code (JMS, database).
Environment: Java, Spring MVC, Hibernate, JMS, HTML5, CSS/SCSS, Junit, Eclipse, Tomcat,and Oracle.