Research

Our major research interest concerns the development of statistical and machine learning methods and software for computational biology. I am particularly interested in modeling complex biological data, such as high-throughput genomic data, network biology, population genomics, and related evolutionary problems. Currently, my research projects mainly focus on the following topics:
  • computational genomics
  • network biology
  • population genomics
  • machine learning

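Our research mainly builds computational biology models based on machine learning and statistical theory. Our current work centers on the iFlora project (an intelligent flora), for which we develop and optimize big-data models to support its species identification algorithms. The main research directions are as follows: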

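  • Developing and optimizing high-quality genome assembly techniques for plant genomes;
  • Developing new algorithms to detect genetic variation such as mutations, recombination, and gene copy number variation;
  • Developing new models for the analysis of RNA-seq, epigenetic, ChIP-seq, and single-molecule sequencing data;
  • Building machine learning models to classify and identify more than ten thousand plant species;
  • Building a cloud platform for large-scale genomic computation.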

Software

  • Sclust:  A C++/R package for inference of subclonal populations in cancer genomes using smoothing splines. (If you are downloading from mainland China, this mirror is much faster: rj.run/downloads/Sclust.tgz)
  • FrSVM:  An R program for microarray classification. The name is short for "Filter by highly ranked gene for Support Vector Machine".
  • netClass: An R package for network-based microarray classification.