Data Scientist Resume

Chicago, IL

SUMMARY:

  • Master’s Degree with 5+ years of data analysis experience using various analytic, statistical, information processing and data management tools
  • Experienced with analytic tools such as SAS, R, SQL, SPSS and other OLAP tools
  • Experienced with Hadoop, MapReduce, Hive, Aster, Pig, HBase and other big data technologies
  • Wrote and ran complex SQL scripts on Oracle, MySQL and Teradata database servers to extract and analyze records
  • Applied supervised and unsupervised machine learning algorithms and statistical techniques, such as multiple regression, Bayes’ classifier, k-means clustering and principal component analysis, along with visualization techniques, for pattern recognition and predictive modeling in statistical computing environments
  • Conducted development, validation and scoring of predictive models, using validation methods such as leave-one-out cross-validation (LOOCV), k-fold cross-validation and the bootstrap
  • Prepared data for statistical modeling, including data cleansing, descriptive statistics, missing data analysis, data validation and preliminary data reporting, using SPSS Modeler and SAS Enterprise Miner
  • Sound statistical knowledge to interpret and diagnose models and make valid conclusions from volumes of data
  • Proficient in data visualization and in presenting models and results to non-technical teams
  • High-level experience in Base SAS, SAS/Macros, SAS/SQL, SAS/Stat, SAS/Connect, SAS/Access, SAS/Graph, SAS/ODS, SAS/OLAP, SAS/EBI, SAS/ETS and SPSS Modeler
  • Extensive experience in preparing reports, tables, listings and graphs

TECHNICAL SKILLS:

Modeling Tools: SAS, R, SPSS, Advanced Excel

Data Visualization Tools: Tableau, MS Visio

RDBMS and Database: MS Access, SQL Server, MySQL, Teradata

Programming languages and big data tools: C, Java, HTML, JavaScript, Hadoop, MapReduce, Hive, HBase, Pig, Aster

Operating systems: UNIX, Linux

PROFESSIONAL EXPERIENCE:

Data Scientist

Confidential, Chicago, IL

Responsibilities:

  • Preprocessed large data sets, including collecting, cleaning and transforming data, with MapReduce and Pig on a Hadoop cluster using Linux commands
  • Performed NLTP (Natural Language Text Processing) in PostgreSQL on Teradata databases to extract keywords from customer reviews
  • Built multiple predictive models and compared them by scoring
  • Conducted model validation and model selection, and predicted customers’ potential behaviors based on given features
  • Developed strategies based on the predictions to improve sales and meet business goals
  • Generated reports and visualized the results using ggplot2 and googleVis in R, and Tableau
  • Conducted advanced analytical tasks in machine learning, such as neural networks (NN) and genetic algorithms (GA), to predict potential risks and benefits

Data Analyst

Confidential, Rochester, NY

Responsibilities:

  • Translated business requirements into data mining problems by working with business teams
  • Performed ETL in SAS on social media data from various sources
  • Conducted market segmentation for a new product based on historical performance records, using several classification models (Bayes classifier and support vector classifier) and unsupervised learning methods in advanced SAS
  • Performed model validation using k-fold cross-validation and the bootstrap
  • Identified potential target markets and recommended an optimized match of promotions

Researcher

Confidential, Evanston, IL

Responsibilities:

  • Developed a Peer Review model as a learning and assessment approach, created an experimental dataset in MySQL, formulated quality assurance strategies, and took charge of designing algorithms and testing the online system
  • Implemented the system in several programming courses in Confidential at Confidential University since 2011, replacing the traditional assessment approach, and created strategies to improve the model and system based on feedback
  • Preprocessed raw experimental data and analyzed it with respect to student behavior, conducted questionnaire analysis and produced a learning outcome report, presented relevant s at the international conference Confidential, Singapore
  • Wrote the paper and published it in Computer& (ranked #1 in computer journals)
  • Extended and improved the system using data mining methods, such as comparing three algorithms at the reviewer assignment stage with k-means clustering
  • Led a cross-functional team, ran the Peer Review program, and trained junior research assistants