
Data Scientist Resume


SUMMARY:

Confidential is a data scientist, enterprise information architect, big data architect & global data strategy leader. He is an inventor for customer-focused, innovative, adaptive clients. He is comfortable communicating with senior business leaders and senior IT management to deliver world class systems. His specialties are in Machine Learning, AI, Linguistics, NLP, UML, Conceptual Modeling, Logical Modeling, and Semantic Web Ontological Modeling, as well as Enterprise Search Strategy & Architecture.

TECHNOLOGY SUMMARY:

Linguistic Algorithms: R tm, Python NLTK, Gensim, SpaCy, Sense2vec, Triplets, Linguistics Analysis Services, NLP, CL, Collocation Analysis, Generative Patterns, Dependency Grammars, SLING, DRAGNN, SyntaxNet, Sonnet

Data Science & Viz: R, RStudio, SPSS Modeler, SPSS, QlikView, MDX, Tableau, Anaconda Spyder, SAS, BIRT, SSAS, SAS EM

AI/ Neural Nets: TensorFlow, Keras, Word2vec, Doc2vec, CNNs, ANNs, LSTM, RNNs, GANs, Theano, Torch, Bidirectional LSTM

Ontology, Semantic Web Modeling: Magic Draw Visual Ontology Modeler, Smartlogic, RDF, RDFa, Turtle, SKOS, OWL, OWL2, SPARQL, Linked Data, Neo4j, Open World Lexicography Assumptions, Ontology Frameworks, Revelytix, Protege

Architectural Modeling: Magic Draw, Mega, Troux, IBM Rational, Sparx, ArchiMate, Planview, Orbussoftware, ARIS

UML Modeling: Magic Draw, Zachman, TOGAF, RUP, ArgoUML, RSA

Glossary, Models: BG, ACORD Framework, Automotive All Divisions; Healthcare; Utility; Gas & Oil Process; GRC; Enterprise, Universal

Big Data, NoSQL: MemSQL, Hortonworks, Cloudera, Intel, Sqrrl, Cassandra, MarkLogic, Couchbase, Cloudant, Alpine Labs, DataStax, MongoDB

Pivotal Platform: HAWQ, Gemfire, Spring XD, MADlib, PivotalR, Greenplum, PostgreSQL, PL/R, PL/Pythonu, plpy

Watson: Watson Knowledge Studio, Watson Explorer, Watson Explorer Content Analytics

Data Modeling: IBM IDA, UML, ERwin, Sandhill, Rational Data Architect, Star Schema, Snowflake, PowerDesigner, Navigator, James Martin, IEF, IEW, 3rd Normal Form, ER/Studio, SA, RDA, ADRM, Big Data Modeling Techniques

ETL, ELT, ETML, EAI: Sqoop, Pig, Apache NiFi, Composite, Information Server, Informatica, ETI, DataStage EE, SAS ETL, SSIS 2008, Talend

Data Profiling/Quality: Exeros / CA ERwin Data Profiler (now IBM Optim), Evoke AXIO (now Informatica Data Explorer), SAS, ProfileStage, Information Analyzer, Talend Open Profiler, BODS Data Profiler, EIM, Information Steward, Trillium

Metadata: MITI MIMB, Unicorn, IBM Metadata Workbench, MetaStage, Ron Ross, Platinum Repository, Global IDS

Database: MemSQL, Pivotal HAWQ, Hortonworks, Redis, Kudu, z/OS DB2, UDB DB2, SQL Server, MySQL, PostgreSQL, HANA

Graph Databases/Layers: AllegroGraph, GraphLab/Dato, Giraph, GraphX, Neo4j

Analytics Hardware: MemSQL, Netezza, iSAS, Oracle Exadata, EMC Greenplum, Quad/Dual Core, zSeries, S390, pSeries, Linux

EDW/DM Methodologies: Kimball Conformed Dimensions, Chris Adamson, Inmon CIF

Enterprise Search: Solr, LucidWorks, Coveo, Attivio, MindBreeze, Sinequa, Smartlogic, Endeca, Lucene, GPText, Nutch, hakia, GSA, Elasticsearch, FAST, Tika, Gora, Avro, PIO, OpenNLP, Stanford CoreNLP

ML, Data Mining: R tm, NLTK, SpaCy, Gensim, SAS EM, IBM Intelligent Miner, SQL Server Data Miner, Predixion

Programming Languages: C, C++, C#, Python, SQL, Go, Scala, Julia, J2EE, Perl, Bash, JMS, Ruby on Rails, Clojure, JavaScript, React.js

PROFESSIONAL EXPERIENCE:

Confidential

Data Scientist

Responsibilities:

  • Developed solutions in R and Python for a 2018 subset of 200+ new models. Advised client on H2O.ai, Python and R XGBoost, and LIME solutions for billing and credit scenarios.
  • Advised client on Spark and Kafka architectures: MemSQL Replication with Spark. IOT strategy with Kafka.
  • Lead Data Scientist on Acxiom Demographics approaches to customer segmentation and Dun & Bradstreet firmographic analytics for commercial and industrial clients in SAP HANA.
  • Customer analytics in Hana, RStudio, Anaconda SPYDER and in MemSQL.
  • Proposed Customer Lifetime Value analytics based on KYC (know your customer) personalization, segmentation, event history, and hazard and survival models, combined with energy efficiency data.
  • Proposed Gaussian Mixture Model (GMM), K-Means Clustering and General Retention Models (GRM) for customer segmentation and personalization approaches.
  • Used CIM Utility model to vet architecture for analytics model. Agile facilitation of product owner user acceptance criteria communication with the BI/Analytics Team.
  • Python internal consulting and advising. Tableau analytics.
  • Participated in backlog meetings, sprints and sprint planning meetings.
  • Part of new team roll out of data science and analytics.
  • Requested architectural clarity and NewSQL product internal consulting for the transition from an in-memory row store to a column store approach for analytics, until planned IoT and real-time approaches.
  • Accepted and trusted by the marketing research business leaders.
  • Proposed graph database and ontology modeling approach in Giraph and Protégé to address a director’s innovative ideas about mobile analytics for each customer as a self-serve model and for the customer account managers to employ in the field.
  • Partnered with gifted academic leader on this customer-friendly graph database analytics model.
  • Reviewed and analyzed ARIS enterprise architecture models for billing, time of use, inverted block key utility industry analysis to get up to speed with client processes.
  • Installed and configured Solr; trained and mentored data scientists in Solr AI and ML search capabilities.
  • Led the team in coding with Python Dask as an alternative approach to Spark.
  • Participated in discussions on one-hot encoding, imbalanced data, and imbalanced classes.
  • Vetted colleague’s innovative extreme value analysis, time series, and control chart models.
  • Use of git within VSTS. Trained data scientist in Solr; advised on search AI and ML capabilities; Solr up and running.

Confidential

Enterprise Search Consultant

Responsibilities:

  • Enterprise Search Proof of Concept through multi-state rollout. Product evaluation based on vendor selection methodology. Provided guidance on enterprise search architectural, functional (including capability mapping to functional criteria), and technical capabilities, with scoring guidance in all three areas. Built new search capability in Solr 7 and ZooKeeper for scalable search clusters. Demonstrated Apache Foundation open source enterprise search capabilities to technology and business leaders.
  • Evaluated Hue compared to Kibana. Positioned open source machine learning search capabilities.
  • Created Enterprise Search Center of Excellence strategy and roadmap.
  • Positioned open source Stanford CoreNLP and OpenNLP for natural language processing.
  • Evaluated impact of ontologies on enhancing enterprise search capabilities.
  • Evaluated Microsoft Azure position supporting Elasticsearch or Solr.
  • Evaluated vendor managed service and platform and software as a service offerings.
  • Evaluated Amazon Cloud Search with A9, then with Solr and Elasticsearch.
  • Evaluated vendor machine learning and NLP capabilities.
  • Evaluated vendor cloud capabilities.
  • Created a Spark, Kafka, Solr high performance search architecture.
  • Created and led A3 quality process to guide team to add formal testing capabilities and personnel to the PoC, and to add open source enterprise search to measure and independently vet any enterprise search vendors.
  • Demonstrated Solr 7.2.1 up and running with Solritas dashboard with Velocity templating.
  • Provided guidance on Solr 7.3 SolrCloud.
  • Provided guidance on Solr DisMax Query Parser and Solr Extended DisMax Query Parser.
  • Evaluated Solr public websites.
  • Researched Search Use Case Capabilities: Personalization; Recommenders; Scoring; Ranking; Weighting; Boosting.
  • Provided leadership on building a full-time staff to support the Enterprise Search Center of Excellence.
  • Positioned one of the world’s top ontologists to advise the company on content curation and a metadata cleanup strategy based on ontology modeling with TopQuadrant and Protégé.
  • Provided risk assessment based on architectural request.
  • The risk was about prioritizing time to deliver with a one-vendor only strategy versus vetting more cloud, managed services, platform, infrastructure or software as a service vendors for enterprise search.
  • Proof of concept testing in XML and RESTful Services API. Learned the infrastructure and operations organization to identify, properly fund, and engineer realm security solutions with a domain administrator during the PoC. Led to successful outcomes with Kerberos and NTLM for crawling SharePoint data.

Confidential

Data Scientist & Solution Architect

Responsibilities:

  • Python coding in spaCy, gensim, VADER sentiment analysis, TensorFlow, LSTM, scikit-learn, NumPy, and SciPy in Sublime Text 3; consistent GitHub and git user; parallel programming in R and Python; specialized data science testing; analysis of data science presentations; formal documentation of data science artifacts; innovative data science model decomposition process modeling; data science strike team planning; collaboration planning; HAWQ big data SQL query coding and analysis.
  • Evaluated Watson Knowledge Studio and Content Analytics. First team to install Elasticsearch with Python. First individual to install ELK stack: Kibana up and running for search analytics.
  • Based on Elasticsearch JSON, reduced unstructured data analytics, TF-IDF, BM25/Okapi, and relevancy ranking from hours to seconds.
  • Quality Pipeline: Enhancing 270 Python modules; over 1000 functions; Vetting, Ops all R, Python, and SPSS Modeler pipelines.
  • Horizon VM Client testing of Windows 10 for all major data science products features and functionality.
  • Guided Confidential staff for silent switch R, RStudio, and Anaconda installs.
  • Trained and engineered SPYDER solutions for business area.
  • Responsible for happiness of Data Science team first model area selection. Guided Alerts model data science process modeling.
  • Part of Data Science Champion Challenger methodology to rewrite all Big Four vendor Python models in R.
  • Single-handedly analyzed and, through testing, re-engineered Pivotal PL/Pythonu PostgreSQL code that also included Jython.
  • Provided linguistic guidance on dependency grammars and collocations.
  • Partnered with data lake data engineering team and with WIT QlikView talent to deliver vetted and scalable brilliant visualizations.
  • Worked with Palantir large-scale data integration on AWS for customized data science and machine learning apps.
  • Principal Component Analysis, Topic Modeling.
  • Analysis of DTC and OBD codes for Connected Vehicle models.
  • Evaluation of SAS biostatistics models to apply to survival model evaluation.

Confidential

Senior Enterprise Information & Solution Architect

Responsibilities:

  • Led creation of Big Data Lab and facilitated creation of Hortonworks, Tableau, and QlikView Lab Analytics.
  • Assisted Confidential Healthcare Director in data architecture assessment meetings with a BCBS of Michigan architectural leadership team for the BCBSM Canonical Information Model Blueprint and Roadmap to build the business case.
  • DataStax, Sqrrl, Cloudera, Cloudant, Splunk, MarkLogic, Vertica, Hortonworks, Datameer, MongoDB, Couchbase, Riak, Cassandra, Accumulo, HBase, GSA, BigTable, SimpleDB, Hypertable, Neo4j, Intel, AllegroGraph, Tez, Stinger, Hive, Falcon, ZooKeeper, Pig, Pig Latin, Ambari, Mahout, NLP, CL, and SEO algorithm research. Scala, R, Python, Anaconda programming.

Confidential

Senior Enterprise Information Architect / Semantic Web / Big Data / NoSQL Leader

Responsibilities:

  • Blueprinting, Roadmapping Enterprise Information Strategies and Enterprise Information Architecture Governance across all Healthcare, Life, and internal corporate domains.
  • Conceptual and Logical IDA modeling for Governance, Risk, and Compliance areas: ISO 2, PCI DSS; Domain-specific metrics / measurement analysis; EA Modeling Analysis in Mega; Semantic Web & Visual Ontology Modeling; Canonical Data Services Modeling Analysis, RDF for Healthcare Exchanges; OWL; TOGAF; IaaS Magic Draw Modeling.
  • FIBO; canonical modeling; RESTful Services, JSON; NoSQL; Big Data; Hadoop; MarkLogic; Thought leader to engage world’s top visual ontology modeler to teach Semantic Web and Ontology Master Class at world class global healthcare leader’s formal education and training facility. RDF, OWL, SKOS, Turtle, Graph Databases, Franz, TopQuadrant, MarkLogic, SmartLogic, MongoDB, Hortonworks.
  • Netezza Architecture. Vetted the product of the IBM MDM (formerly Initiate) team. Vetted Palantir approach by bringing in one of the world’s top Semantic Web Ontology Leaders to train 40 architects in Dynamic Ontology Concepts.

Confidential

Senior Enterprise Information Architect / Data Science Leader

Responsibilities:

  • Created Roadmap to integrate DMBOK Data Architecture within TOGAF 9, ISO/IEC 11179 with CIM in SPARX. Drove alignment of DMBOK data governance circle technologies to corporate capabilities and artifacts.
  • Evaluated Composite for data fabric / data services virtualization approach; evaluated industry data model approach: introduced ADRM utility industry data models to Confidential.
  • Introduced Tableau to Confidential: new approach to data visualization and discovery.
  • Evaluated Global IDs for end to end 10 layer metadata management. For Smart Currents, proposed Data Governance Framework, and participated in preparing EAG accepted Information Lifecycle slides for larger presentation imbued with Utility and Gas CIMs within an Enterprise Semantic Model.
  • Advised clients on data warehousing architecture, including shared nothing, data warehousing appliances, data profiling, data modeling, including high level modeling for planning complexity analysis.
  • Evaluated Netezza, IBM iSAS Smart Analytics, IBM Information Warehouse Builder, Exadata, EMC Greenplum, Teradata. Evaluated Netezza ESRI global mapping analysis capabilities and complete Netezza offering with IBM Netezza engineers.
  • Evaluated Troux Information for Data Lineage, Metadata Analysis, and Data Mapping to Enterprise Architecture landscapes; Sparx for CIM utility model to relational translation; evaluated Energy ICT scaling of Confidential: web performance vs. database scale model impacts; Itron, Teradata, Confidential Active SmartGrid Analytic Platform solution.
  • Evaluated nationwide utilities for data architecture capabilities; EMC Greenplum Data Science & Big Data architecture and engineering with EMC Data Science team.

Confidential

Senior Data Architect / Data Visualization / Data Modeler

Responsibilities:

  • Tableau Desktop & Server evaluation and creation of Reference Architecture; Led and engineered first Tableau Server Proof of Concept in GM Labs.
  • Positioning of Tableau and Excel Power Pivot for major BI infrastructure cost savings. Alignment with corporate data governance strategy.
  • Data analysis and data model validation analysis of corporate data models.
  • Architectural review of Cognos 8 projects; Framework Manager analysis; Business Layer / Transformation Layer analysis.

Confidential

Senior Enterprise Information Architect

Responsibilities:

  • Created data warehousing, data modeling, data profiling, metadata engineering and SSRS BI reporting and SharePoint integration Roadmap.
  • Created new enterprise data reporting layer architecture.
  • Data modeled a new data warehouse and dimensional models for cubes from scratch.
  • Engineered SSAS Cube in 2 months from Siebel CRM via Saphir/Safyr 5.0 and ERwin 7.3.10 with Source to Target Mappings for SSIS and SQL Server 2008 R2.
  • Mentored and Trained talented staff in Kimball and Hybrid Inmon Data Warehousing Concepts and EDW Engineering of New System created from scratch.
  • Created BI architecture for new Windows 2008 Server supported BI landscape. Led creation of Offline cubes in CubeSlice.
  • Set strategy and tactics of BI phased in cube development.
  • Led Agile SCRUM teams with guidance of a SCRUM master.
  • Drove timely purchase, installation, configuration, and execution of Exeros ERwin Data Profiler to profile data for column and pattern analysis in days, not weeks.
  • Created Saphir to ERwin Subject areas in days not months from Siebel for entire Siebel Business Objects, Application, custom Siebel, and database Layers.
  • Mentored team on Model-Driven Design Approach from ERwin.
  • Through excellence of approach, and outstanding management grasp of my Data-Driven Approach, extended consultancy twice Confidential initial contract.
  • Continually improved the cube-building process.
  • Presented data modeling and the cube to the CIO. Engineered SSRS, Performance Point, Excel Power Pivot Architecture guidelines and solutions for client build out.
  • Built strong Microsoft Engineering Team relationship to successfully implement an SSIS Package Deployment Framework with key MEDC Leaders and technology members.
  • Developed mission critical SSIS packages for EDW and cube development.
  • Demonstrated and enabled local MEDC team to use SSRS OLAP reporting that was accurate, timely and productionized.
  • SSRS replaced Business Objects / Crystal and potentially OBIEE. Led and mentored government and economic staff; led architecture and purchase of Microsoft 2008 Server products, and supporting hardware.
  • Based on CIO communication, COO of MEDC and Governor of Michigan used and benefitted from faster information access to new, accurate and timely systems.

Confidential

Senior Data Architect / Data Modeler / Lead Information Architect

Responsibilities:

  • Due to BBDO separation, data modeled and created an abstracted data services layer in ERwin 7.3.3, UDB 9.5 and 9.7, DbVisualizer, Talend Open Studio, Talend Open Profiler, and MySQL; created Canadian parts program and international incentives program data structures, databases, and ETL for new system; built glossary.
  • Met 90-day multiple platform delivery target.

Confidential

Senior Data Architect / Data Modeler / Tableau Information Visualization

Responsibilities:

  • Created new data models for the new BI analytics in ERwin. Analyzed, designed, built cubes with multiple perspectives (views) for vehicle sales North America. Led Analysis Services Migration.
  • Established Best Practices for SQL Server 2008, Integration Services 2008, and Analysis Services 2008.
  • Developed SSIS packages for EDW and cube building for Crossovers for all industry segments and vehicles.
  • Worked in a consulting role to advise leader on JD Power PIN MicroStrategy data acquisition and aggregation functions.
  • Wrote 277 T-SQL DBA queries.
  • Wrote MDX queries in support of the cube perspectives.
  • Used Tableau Data Visualization to analyze SSAS cube data quality.
  • Provided SAS 9 product leadership to team.

Confidential

Senior Data Architect / Data Modeler / Web Analytics

Responsibilities:

  • Created over forty data models based on vendor’s web-based eQuoting, eRenewal, Group Wide Change, and eEnrollment; engineered Web-based BI solutions for the EDW and for Cognos 8 Framework Manager; built Metadata Bridges for passing metadata and built new data models from scratch in ERwin for Cognos 8 and Informatica PowerCenter from ER/Studio; data profiled all impacted data with Informatica Data Explorer.

Confidential

Senior Web Systems Architect / Business Process Modeler

Responsibilities:

  • Led relationship to partner with Endeca Linguistic Search Engine. Worked with a wide array of product chief inventors, master scientists, and distinguished engineers to deliver timely world class solutions to major commercial, government and educational clients.
  • First to present IBM Global Business Services Lean Data Governance Model to a major healthcare client.
  • Invented billion+ row cross-platform solutions for the same healthcare client.
  • For the same client, worked with IBM Massachusetts Research Lab (Waltham) to set up a clean data integration software environment on three separate compilers, and partnered with the lab to find the root cause of alpha characters in a numeric-only financial field. The cause was not the software or the compilers but a data quality issue, which IBM resolved with significant resources.
  • Anticipated, several months in advance, a major software outage affecting several thousand products worldwide, and in three days secured Passport Advantage authority for the manager from the healthcare giant’s purchasing department, which the sales executive had worked on for six months. On my Friday flight day, all outsourced personnel products running on a sixty-day free license failed.
  • While the manager was pressed to send out the production licenses in a hurry, I held up the process, made his issue a Severity One with IBM, and within an hour he had the right software and patches to bring up all data integration worldwide.
  • For a large client that purchased a data center in the Southwest, invented a major change to the HP-UX file system, implemented by their technical resource under my direction, to successfully run a large-expenditure data integration product where the internal expertise was primarily Sun/Oracle.
  • Invented software and trained my mentor in mainframe to Teradata connectivity.
  • Wrote a white paper, invented software, and trained a UNIX/AIX system administrator in PAM, the pluggable authentication module from Sun, for connecting the IBM data integration product to a giant food retailer’s LDAP. Hardware evaluation and data integration strategy for SAS.
  • Worked with Lee Scheffler on innovating from his product solutions to my hardware vendor innovations. First to connect mainframe to Teradata at Confidential.
  • Created PAM LDAP white paper based on my innovations at Confidential.
  • Trained an AIX admin in PAM, pluggable authentication module from Sun Microsystems.
