Software/Data Engineer Resume


  • 18 years of software development and data analytics experience.
  • Strong background in software development, data engineering, statistics and machine learning.
  • Seeking a position in software development, data analytics and machine learning.


Big data platforms: Hadoop/YARN application, PIG/HIVE, MapReduce, Spark, CDAP, Cloudera, Hortonworks, Kafka, Storm

NoSQL: HBase, Cassandra, Redis, Elasticsearch, Solr

Data engineering: distributed data processing pipelines, data warehouses, SQL/NoSQL databases, data query, data visualization, machine learning and feature engineering

Java/JEE application development with: JSP/JSF/Spring/Jersey/RESTful API

Relational database application development with: Oracle, MySQL, PostgreSQL

Grid computing and parallel programming: Sun Grid Engine (SGE, now called OGE, Oracle Grid Engine) and LSF

Software development with: C++/Java/Scala/Perl/R

Web UI development with: HTML/XML/JavaScript/Ajax



Software/Data Engineer


  • Design and develop big data processing (Java, Kafka, Elasticsearch, Redis, Spark, Scala, HDFS)
  • Apply the CDAP big data platform to data processing, feature extraction and model training (CDAP apps such as Stream, Flowlets, DataSets (Tables and FileSets), REST Services, MapReduce programs, Spark programs)
  • Develop a machine learning recommendation system for Volvo autonomous vehicles


Staff Software Developer


  • Big data processing (Kafka, Storm, Spark, Cassandra, MapReduce)
  • Machine learning application development for cybercrime/fraud detection (supervised learning, classification)
  • Machine learning feature engineering:
  • Develop a web UI to automatically create machine learning variable (feature) specifications from combinations of variable attributes (pivots, functions, subjects, transaction event types, scopes, time windows and filters).
  • Develop and maintain the Variable Generation Framework (VGF), which consumes transaction events and computes variable (feature) values according to the variable specifications.
  • Create, select and test variables (features) to improve machine learning performance.


Backend Software Developer


  • Development of a large-scale mobile backend and data analytics system using Java/JEE/Hadoop
  • RESTful API: Jersey/GlassFish/Grizzly
  • Authentication and Authorization: OAuth2
  • Cloud data storage, exchange and synchronization: Amazon S3, Google Drive API, Dropbox API, Facebook API
  • Messaging system: Kafka
  • NoSQL database: HBase
  • Relational database: MySQL
  • Central caching: Redis
  • Full-text searching: Solr
  • Big data analysis with R, MapReduce, YARN/PIG/HIVE


Software Developer 4/Data Analytics/Group Lead


  • Java/J2EE, JSF, Hibernate, Perl, MySQL, Continuous Integration/Hudson
  • DOE NESC HPC massively parallel computing (High-performance computing/Sun Grid Engine)
  • Led technical development of the DNA sequence quality control and analysis software system (Rolling QC). Rolling QC automatically picks up the sequences generated by sequencing machines and runs quality score metrics, contaminant/artifact detection, sequence redundancy analysis, draft assembly, alignments to a variety of genome/transcriptome references and MEGAN analysis. The Rolling QC reports not only give feedback on DNA sampling and library creation but also provide guidance for future sequence assemblies and functional discoveries.
  • Led the technical design and development of a draft sequence assembly system (QC draft) for DNA sequences from single cells and isolates. The DNA sequences are automatically passed into an assembly pipeline composed of read normalization and a Velvet/Allpaths-based assembler. The assembly results (contigs) are aligned to different sequence references, and the alignment results are analyzed by taxonomic classification. Finally, the assembly draft is automatically transferred to Integrated Microbial Genomes (IMG) for sequence annotation and functional discovery.


Startup Entrepreneur/Software Developer/Data Engineer


  • Java/J2EE, JSF, Hibernate, JFreechart, Lucene, JDBC, MySQL, R
  • Designed and developed a web application for large-scale, cross-experiment gene expression data integration and visualization (Gene Expression Browser).
  • Designed and developed ETL (Extract, Transform, Load) pipeline tools to extract and load new data into the web application.
  • Designed an experiment normalization strategy that normalizes gene expression experiments with diverse designs into Treatment over Control (T/C) ratios. As a result, data from different experiments can be transformed into a uniform format and integrated. The strategy was applied in the Gene Expression Browser software system to normalize 436 microarray experiments from NCBI GEO into 2,007 Treatment over Controls (T/Cs), in which 45,460,330 pairs of gene expression ratio data are available for searching and displaying.
  • Released the Arabidopsis Gene Expression Browser (GEB) as a free online bioinformatics tool. GEB has provided Arabidopsis gene expression search and visualization services for thousands of scientists from over 30 countries.


Senior Software Developer and Data Engineer


  • Java/J2EE, JSF, JFreechart, Hibernate, JDBC, Oracle, R
  • Worked on developing a new clinical diagnostic assay for cardiovascular diseases using multiple protein markers.
  • Screened and validated biomarkers for the cardiovascular disease assay by in silico analysis of expression data from mRNA microarrays and protein chips.
  • Designed and developed a data management and mining software system that integrates protein biomarker intensity data from the Luminex assay with patients' clinical data, such as age, gender, race, cholesterol level, disease history, medical history, diet and smoking/alcohol status. The system helped to find relationships between biomarkers and cardiovascular diseases.


Computational Scientist/Data Analytics


  • Java/J2EE, JSF, Hibernate, JDBC, Web service, R, Oracle
  • Led the development of the Standard Operating Procedure (SOP) for Transcriptome eXpression Profiling (TxP) at Confidential Inc.
  • Led gene expression profiling analysis on Confidential traits featuring drought resistance, high yield and tolerance to nutrient starvation.
  • Developed a software system to integrate the data from gene expression profiling, high-throughput cloning and trait phenotyping. The system supports plant genetic engineering and accelerates the screening of traits according to target characteristics.


Senior Software Developer/Data Analytics


  • Java/J2EE, WebSphere, Oracle
  • Developed software to integrate the maize genetic map with the physical map using hybridization-based markers (overgos) that exist in both maize genetic markers and maize contig sequences.
  • Developed a software application for gene architecture analysis and annotation. The application accepted a genomic DNA sequence as input. It first ran blastn against the NR sequence database, then blastx against the Swiss-Prot protein database, and finally performed a PFAM protein family domain search. The application integrated the results of these analyses and output the promoter, introns, exons, protein domains, NR annotation and Swiss-Prot annotation of the input sequence.


Software Developer


  • C++/Java, JSP/Servlet, Oracle/SQL Server/Sybase
  • Livelink developer
  • Document management system
  • Full-text search engine
