Software/Data Engineer Resume

SUMMARY:

  • 18 years of software development and data analytics experience.
  • Strong background in software development, data engineering, statistics, and machine learning.
  • Seeking a position in software development, data analytics, and machine learning.

TECHNICAL SKILLS:

Big data platforms: Hadoop/YARN applications, Pig/Hive, MapReduce, Spark, CDAP, Cloudera, Hortonworks, Kafka, Storm

NoSQL: HBase, Cassandra, Redis, Elasticsearch, Solr

Data Engineering: distributed data processing pipelines, data warehousing, SQL/NoSQL databases, data querying, data visualization, machine learning and feature engineering

Java/JEE application development with: JSP/JSF/Spring/Jersey/RESTful API

Relational database application development with: Oracle, MySQL, PostgreSQL

Grid computing (SGE and LSF) and parallel programming: Sun Grid Engine (SGE, now called OGE, Oracle Grid Engine)

Software development with: C++/Java/Scala/Perl/R

Web UI development with: HTML/XML/JavaScript/Ajax

PROFESSIONAL EXPERIENCE:

Confidential

Software/Data Engineer

Responsibilities:

  • Designed and developed big data processing applications (Java, Kafka, Elasticsearch, Redis, Spark, Scala, HDFS).
  • Applied the CDAP big data platform to data processing, feature extraction, and model training (CDAP components such as Streams, Flowlets, Datasets (Tables and FileSets), REST services, MapReduce programs, and Spark programs).
  • Developed a machine learning recommendation system for Volvo autonomous vehicles.

Confidential

Staff Software Developer

Responsibilities:

  • Big data processing (Kafka, Storm, Spark, Cassandra, MapReduce)
  • Machine learning application development for cybercrime/fraud detection (supervised learning, classification)
  • Machine learning feature engineering:
      • Developed a web UI that automatically creates machine learning variable (feature) specifications from combinations of variable attributes (pivots, functions, subjects, transaction event types, scopes, time windows, and filters).
      • Developed and maintained the Variable Generation Framework (VGF), which consumes transaction events and computes variable (feature) values according to the variable specifications.
      • Created, selected, and tested variables (features) to improve machine learning model performance.

Confidential

Backend Software Developer

Responsibilities:

  • Development of large-scale mobile backend and data analytics system using Java/JEE/Hadoop
  • RESTful API: Jersey/GlassFish/Grizzly
  • Authentication and Authorization: OAuth2
  • Cloud data storage, exchange and synchronization: Amazon S3, Google Drive API, Dropbox API, Facebook API
  • Messaging system: Kafka
  • NoSQL database: HBase
  • Relational database: MySQL
  • Central caching: Redis
  • Full-text searching: Solr
  • Big data analysis with R, MapReduce, YARN/Pig/Hive

Confidential

Software Developer 4/Data Analytics/Group Lead

Responsibilities:

  • Java/J2EE, JSF, Hibernate, Perl, MySQL, Continuous Integration/Hudson
  • DOE NESC HPC massively parallel computing (high-performance computing/Sun Grid Engine)
  • Took technical leadership in developing a DNA sequence quality control and analysis software system (Rolling QC). Rolling QC automatically picks up the sequences generated by sequencing machines and runs quality score metrics, contaminant/artifact detection, sequence redundancy analysis, draft assembly, alignments to a variety of genome/transcriptome references, and MEGAN analysis. The Rolling QC reports not only provide feedback on DNA sampling and library creation but also give guidelines for future sequence assemblies and functional discoveries.
  • Took technical leadership in designing and developing a draft sequence assembly system (QC draft) for DNA sequences from single cells and isolates. The DNA sequences are automatically passed into an assembly pipeline composed of read normalization and a Velvet/Allpaths-based assembler. The assembly results (contigs) are aligned to different sequence references, and the alignment results are analyzed by taxonomic classification. Finally, the assembly draft is automatically transferred to Integrated Microbial Genomes (IMG) for sequence annotation and functional discovery.

Confidential

Startup Entrepreneur/Software Developer/Data Engineer

Responsibilities:

  • Java/J2EE, JSF, Hibernate, JFreeChart, Lucene, JDBC, MySQL, R
  • Designed and developed a web application for large-scale, cross-experimental gene expression data integration and visualization (Gene Expression Browser).
  • Designed and developed ETL (Extract, Transform, Load) pipeline tools to extract and load new data into the web application.
  • Designed an experiment normalization strategy that normalizes gene expression experiments with diverse designs into Treatment over Control (T/C) ratios. As a result, data from different experiments can be transformed into a uniform format and integrated. The strategy was applied in the Gene Expression Browser software system to normalize 436 microarray experiments from NCBI GEO into 2,007 Treatment over Control comparisons (T/Cs), in which 45,460,330 pairs of gene expression ratio data are available for searching and display.
  • Released the Arabidopsis Gene Expression Browser (GEB) as a free online bioinformatics tool. GEB has provided Arabidopsis gene expression search and visualization services for thousands of scientists from over 30 countries.

Confidential

Senior Software Developer and Data Engineer

Responsibilities:

  • Java/J2EE, JSF, JFreeChart, Hibernate, JDBC, Oracle, R
  • Worked on developing a new clinical diagnostic assay for cardiovascular diseases using multiple protein markers.
  • Screened and validated biomarkers for the cardiovascular disease assay through in silico analysis of expression data from mRNA microarrays and protein chips.
  • Designed and developed a data management and mining software system that integrates protein biomarker intensity data from Luminex assays with patients’ clinical data, such as age, gender, race, cholesterol level, disease history, medical history, diet, and smoking/alcohol status. The system helped identify relationships between biomarkers and cardiovascular diseases.

Confidential

Computational Scientist/Data Analytics

Responsibilities:

  • Java/J2EE, JSF, Hibernate, JDBC, Web service, R, Oracle
  • Led the development of the Standard Operating Procedure (SOP) for Transcriptome eXpression Profiling (TxP) at Confidential Inc.
  • Led gene expression profiling analysis of Confidential traits featuring drought resistance, high yield, and tolerance to nutrient starvation.
  • Developed a software system to integrate data from gene expression profiling, high-throughput cloning, and trait phenotyping. The system supports plant genetic engineering and accelerates the screening of traits according to target characteristics.

Confidential

Senior Software Developer/Data Analytics

Responsibilities:

  • Java/J2EE, WebSphere, Oracle
  • Developed software to integrate the maize genetic map with the physical map using hybridization-based markers (overgos) that exist in both maize genetic markers and maize contig sequences.
  • Developed a software application for gene architecture analysis and annotation. The application accepted a genomic DNA sequence as input. It first ran blastn against the NR sequence database, then blastx against the Swiss-Prot protein database, and finally a Pfam protein family domain search. It then integrated the results of these analyses and output the promoter, introns, exons, protein domains, NR annotation, and Swiss-Prot annotation of the input sequence.

Confidential

Software Developer

Responsibilities:

  • C++/Java, JSP/Servlet, Oracle/SQL Server/Sybase
  • Livelink developer
  • Document management system
  • Full-text search engine