Computational biology and Genomics

My research interests concern the development of statistical methods and models for computational biology. Currently, my research projects mainly focus on the following topics:

  • Evolutionary approaches to tumorigenesis
  • Development of prognostic and predictive markers and signatures
  • Novel computational approaches to integrative genomics data analysis

Continue reading “Computational biology and Genomics”

A new fast method for copy number calling, tissue purity estimating and subclone inferring in cancer genome

Our new method has finally been published in Nature Protocols. We developed a series of methods and an accompanying combined C++/R software package, Sclust (around 1.5 GB; the file is large, so download with care). With Sclust, you can perform copy number calling, tumor purity estimation, and clonal and subclonal structure inference from paired normal-tumor whole-genome/exome sequencing data.


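1. It performs copy number calling, tumor purity estimation, and subclonal inference accurately.

2. Subclonal inference is extremely fast: clonal inference for 4,000 to 6,000 SNVs takes only 3 to 5 seconds on a personal computer.

3. Sclust reports the mutation multiplicity for each cluster; at present only a few software packages provide this feature.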


Contact email: yp.cun@outlook.com. Some background on clonal inference follows.

Continue reading “A new fast method for copy number calling, tissue purity estimating and subclone inferring in cancer genome”





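Training course content: an in-depth explanation of the basic ideas of programming and the philosophy of the R language, with plenty of hands-on practice and Q&A sessions on R programming and data processing. Participants work through several examples of analysis in R, including basic data statistics, GEO microarray data analysis, and TCGA data download and analysis.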


A useful course of biomedical data analysis

Biomedical Data Science: http://genomicsclass.github.io/book/

Chapter 0 – Introduction

Chapter 1 – Inference

Chapter 2 – Exploratory Data Analysis

Chapter 3 – Robust Statistics

Chapter 4 – Matrix Algebra

Chapter 5 – Linear Models

Chapter 6 – Inference for High-Dimensional Data

Chapter 7 – Statistical Modeling

Chapter 8 – Distance and Dimension Reduction

Chapter 9 – Practical Machine Learning

Chapter 10 – Batch Effects

525.5x: Introduction to Bioconductor: Annotation and analysis

Setup and basics on biological background (Week 1)

Focus on data structure and management (Week 2)

Focus on genomic ranges (Week 3a)

Focus on genomic annotation (Week 3b)

Testing genome-scale hypotheses (Week 4)

525.6x: High-performance computing for reproducible genomics with Bioconductor

Visualization of genome scale data (Week 1)

Scalable genomic analysis (Week 2)

Multi-omic data integration (Week 3)

Fostering reproducible genome-scale analysis (Week 4)

Legacy material from 2015 Introduction to Bioconductor

RNA-seq data analysis

Variant Discovery and Genotyping

ChIP-seq data analysis

DNA methylation data analysis

Footnotes for all lectures


Continue reading “A useful course of biomedical data analysis”

【c】Frontiers in Single Cell Genomics, Suzhou

Frontiers in Single Cell Genomics



We are pleased to announce the Cold Spring Harbor Asia conference on Frontiers in Single Cell Genomics which will be held in Suzhou, China, located approximately 60 miles west of Shanghai. The conference will begin at 7:00pm on the evening of Monday November 7, and will conclude after lunch on November 11, 2016.

Continue reading “【c】Frontiers in Single Cell Genomics, Suzhou”

A brief introduction to “apply” in R

A good, practical guideline for “apply” in R.

What You're Doing Is Rather Desperate

At any R Q&A site, you’ll frequently see an exchange like this one:

Q: How can I use a loop to […insert task here…] ?
A: Don’t. Use one of the apply functions.

So, what are these wondrous apply functions and how do they work? I think the best way to figure out anything in R is to learn by experimentation, using embarrassingly trivial data and functions.
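As a minimal sketch of the exchange above (the variable names here are made up for illustration and are not from the original post), the same trivial task, squaring a few numbers, can be written once with a loop and once with `sapply`, one of the apply-family functions in base R:

```r
# Trivial data: the numbers 1 to 5.
x <- 1:5

# Loop version: preallocate the result, then fill it element by element.
squares_loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# Apply-family version: sapply calls the function on each element of x
# and simplifies the results into a vector.
squares_apply <- sapply(x, function(v) v^2)

print(squares_loop)
print(squares_apply)
```

Both versions produce the same vector; the `sapply` form simply removes the bookkeeping of the index and the preallocated result.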

If you fire up your R console, type “??apply” and scroll down to the functions in the base package, you’ll see something like this:

Let’s examine each of those.

1. apply
Description: “Returns a vector or array or list of values obtained by applying a function to margins of an array or matrix.”

OK – we know about vectors/arrays and functions, but what are these “margins”? Simple: either the rows (1), the columns (2) or both (1:2). By “both”, we mean “apply the…
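To make the margins concrete, here is a small sketch on a 2 x 3 matrix (the matrix and variable names are illustrative assumptions, not taken from the original post). In base R, margin 1 applies the function to each row, margin 2 to each column, and margin 1:2 to each individual cell:

```r
# A 2 x 3 matrix, filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2)

# Margin 1: apply sum() to each row.
row_sums <- apply(m, 1, sum)

# Margin 2: apply sum() to each column.
col_sums <- apply(m, 2, sum)

# Margin 1:2: apply the function to every cell; the result keeps
# the shape of the original matrix.
doubled <- apply(m, 1:2, function(v) v * 2)

print(row_sums)  # 9 12
print(col_sums)  # 3 7 11
print(doubled)
```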

View original post 1,003 more words

Fancy a challenge? A DREAM of intra-tumour phylogenies

A big chance for dry labs.

Scientific B-sides

You are into tumor evolution? And got a fancy model? Want to battle with the best?

Then check out the ICGC-TCGA DREAM Somatic Mutation Calling – Tumour Heterogeneity Challenge (SMC-Het).

These are the days of Big Science, my friend. You can’t just have a short name …

View original post 286 more words