Developer Resume
US
Education:
- Computer Science (Bioinformatics)
- BS, Software Engineering
Skills:
- OS: Windows, Linux
- Database: SQL Server
- Programming Languages: R, C++, C, C#, Java, Python and SQL
Work Experience:
- Sep 2011 – Confidential, Chapel Hill
Awards:
- NSF: 2010 ICDM Student Award
- SIAM: 2010 SDM Doctoral Forum Fellowship
Selected Projects:
Confidential
Software Environment: SUSE 11
Development Tools: R, C++
Responsibilities: Design, implementation, testing, and writing
Software Description: This project investigates the relationship between transcription factor binding sites and disease-related SNPs. Our hypothesis is that transcription factors are likely to bind near disease-related SNPs. We use several next-generation sequencing techniques, including ChIP-Seq, DNase I-Seq, and FAIRE-Seq, to locate transcription factor binding positions. A new statistical package and scoring model were designed and tested against publicly available GWAS data.
Confidential
Software Environment: SUSE 10.3
Development Tools: R, C++
Responsibilities: Design, implementation, testing, and writing
Software Description: The software summarizes the ChIP-Seq peaks upstream and downstream of each gene into a single binding score or probability, based on the distance between each peak and the transcription start site (TSS) and on the peaks' overlap with the genome sequence. It addresses the problem of estimating binding strength from all ChIP-Seq peaks upstream and downstream of each gene.
Confidential
Software Environment: SUSE 10.3
Development Tools: C++
Responsibilities: Design, implementation, testing, and writing
Software Description: The software discovers sub-matrices of a large data table such that the cell values in the selected rows and columns follow a low-variance distribution and the sub-matrix is as large as possible. It can be used, for example, to find customer groups with similar consumption behavior.
Confidential
Software Environment: SUSE 10.3
Development Tools: Java, Eclipse
Responsibilities: Design, implementation, testing, and writing
Software Description: The software uses multithreading to search the whole genome for motifs defined by TRANSFAC matching. Key responsibilities included identifying the promoter regions of each gene, searching for conserved motifs, and calculating binding strength.
Selected Publications:
- Zhen Hu and Raj Bhatnagar, Mining low-variance biclusters to discover coregulation modules in sequencing datasets, Scientific Programming, vol. 20, no. 1, pp. 15-27, 2012
- Zhen Hu and Raj Bhatnagar, Clustering algorithm based on Mutual K-Nearest Neighbor Relationships, Statistical Analysis and Data Mining, vol. 5, issue 2, pp. 100-113, 2012
- Fabisiak JP, Medvedovic M, Alexander DC, McDunn JE, Concel VJ, Bein K, Jang AS, Berndt A, Vuga LJ, Brant KA, Pope-Varsalona H, Dopico RA Jr, Ganguly K, Upadhyay S, Li Q, Hu Z, Kaminski N and Leikauf GD, Integrative metabolome and transcriptome profiling reveals discordant energetic stress between mouse strains with differential sensitivity to acrolein-induced acute lung injury, Mol Nutr Food Res., vol. 55, issue 9, pp. 1423-1434, 2011
- Zhen Hu and Raj Bhatnagar, Discovery of Versatile Temporal Subspace Patterns in 3-D Datasets, Proceedings of the 11th IEEE International Conference on Data Mining (ICDM), 2011
- Zhen Hu and Raj Bhatnagar, Low-Variance Biclusters to Identify Coregulation Modules in Sequencing Datasets, BIOKDD 2011.
- Shinde K, Phatak M, Freudenberg JM, Chen J, Li Q, Joshi VK, Hu Z, Ghosh K, Meller J, and Medvedovic M, Genomics Portals: Integrative Web-Platform for Mining Genomics Data. BMC Genomics. Jan 13;11(1):27. 2010
- Zhen Hu, Siva Sivaganesan, and Mario Medvedovic, Prior Information Based Bayesian Infinite Mixture Model, 11th International Symposium on Artificial Intelligence and Mathematics (ISAIM), 2010
- Zhen Hu and Raj Bhatnagar, Algorithm for Discovering Low-Variance 3-Clusters From Real-Valued Datasets, Proceedings of the 10th IEEE International Conference on Data Mining (ICDM), 2010
- Freudenberg JM, Joshi VK, Hu Z and Medvedovic M, CLEAN: CLustering Enrichment ANalysis. BMC Bioinformatics 10:234. 2009.