Data Scientist Resume
Chicago, IL
SUMMARY:
- Master’s Degree with 5+ years of data analysis experience using various analytic, statistical, information processing and data management tools
- Experienced with analytic tools such as SAS, R, SQL, SPSS and other OLAP tools
- Experienced with Hadoop, MapReduce, Hive, Aster, Pig, HBase and other supporting big data technologies
- Writes and runs complex SQL scripts on Oracle, MySQL and Teradata database servers to extract records and analyze them
- Applied supervised and unsupervised machine learning algorithms and statistical techniques, such as multiple regression, Bayes’ classifier, k-means clustering and principal component analysis, together with visualization techniques, for pattern recognition and predictive modeling in statistical computing environments
- Conducted development, validation and scoring of predictive models, using validation methods such as leave-one-out cross-validation (LOOCV), k-fold cross-validation and bootstrap on a validation set
- Prepared data for statistical modeling, including data cleansing, descriptive statistics, missing data analysis, data validation and preliminary data reporting, using SPSS Modeler and SAS Enterprise Miner
- Sound statistical knowledge to interpret and diagnose models and make valid conclusions from volumes of data
- Proficient in data visualization and in presenting models and results to non-technical teams
- High-level experience in Base SAS, SAS/Macros, SAS/SQL, SAS/Stat, SAS/Connect, SAS/Access, SAS/Graph, SAS/ODS, SAS/OLAP, SAS/EBI, SAS/ETS and SPSS Modeler
- Extensive experience in preparing reports, tables, listings and graphs
TECHNICAL SKILLS:
Modeling Tools: SAS, R, SPSS, Advanced Excel
Data Visualization Tools: Tableau, MS Visio
RDBMS and Database: MS Access, SQL Server, MySQL, Teradata
Programming languages and big data tools: C, Java, HTML, JavaScript, Hadoop, MapReduce, Hive, HBase, Pig, Aster
Operating systems: UNIX, Linux
PROFESSIONAL EXPERIENCE:
Data Scientist
Confidential, Chicago, IL
Responsibilities:
- Preprocessed large data sets, including collection, cleaning and transformation, with MapReduce and Pig on a Hadoop cluster using Linux commands
- Performed NLTP (Natural Language Text Processing) in PostgreSQL on Teradata databases to extract keywords from customer reviews
- Built multiple predictive models and compared them by scoring
- Conducted model validation, model selection and predicted customers’ potential behaviors based on given features
- Developed strategies based on the predictions to improve sales and meet business goals
- Generated reports and visualized the results using ggplot2 and googleVis in R, and Tableau
- Applied advanced machine learning methods, such as neural networks (NN) and genetic algorithms (GA), to predict potential risks and benefits
Data Analyst
Confidential, Rochester, NY
Responsibilities:
- Translated business demands into data mining problems by interacting with business teams
- Performed ETL on social media data from various sources in SAS
- Conducted market segmentation for a new product in advanced SAS, based on historical performance records, using several classification models (Bayes classifier and support vector classifier) and unsupervised learning methods
- Performed model validation using k-fold cross-validation and bootstrap
- Identified potential target markets and recommended an optimized match of promotions
Researcher
Confidential, Evanston, IL
Responsibilities:
- Developed a Peer Review model as a learning and assessment approach, created an experimental dataset in MySQL, formulated quality assurance strategies, and took charge of designing algorithms and testing the online system
- Implemented the system in several programming courses in Confidential at Confidential University since 2011, replacing the traditional assessment approach, and created strategies to improve the model and system based on feedback
- Preprocessed raw experimental data and analyzed it with regard to student behavior, conducted questionnaire analysis and produced a learning outcome report, presented relevant s at the international conference Confidential, Singapore
- Wrote the paper and published it in Computer& (Ranked #1 in Computer Journals)
- Extended and improved the system using data mining methods, such as comparing 3 algorithms for reviewer assignment with k-means clustering
- Led a cross-functional team, ran the Peer Review program and trained junior research assistants