Sclust paper published in Nature Protocols

After years of effort, our Sclust paper has finally been published in Nature Protocols. Enjoy!

Copy-number analysis and inference of subclonal populations in cancer genomes using Sclust

  • Nature Protocols volume 13, pages 1488–1501 (2018)
  • doi:10.1038/nprot.2018.033
Published: 24 May 2018

Abstract

The genomes of cancer cells constantly change during pathogenesis. This evolutionary process can lead to the emergence of drug-resistant mutations in subclonal populations, which can hinder therapeutic intervention in patients. Data derived from massively parallel sequencing can be used to infer these subclonal populations using tumor-specific point mutations. The accurate determination of copy-number changes and tumor impurity is necessary to reliably infer subclonal populations by mutational clustering. This protocol describes how to use Sclust, a copy-number analysis method with a recently developed mutational clustering approach. In a series of simulations and comparisons with alternative methods, we have previously shown that Sclust accurately determines copy-number states and subclonal populations. Performance tests show that the method is computationally efficient, with copy-number analysis and mutational clustering taking <10 min. Sclust is designed such that even non-experts in computational biology or bioinformatics with basic knowledge of the Linux/Unix command-line syntax should be able to carry out analyses of subclonal populations.
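To see why purity and copy number matter for mutational clustering, it helps to write down the textbook relation between a mutation's expected variant allele frequency and its cancer cell fraction. This is a general sketch, not necessarily the exact model implemented in Sclust. The symbols are assumptions introduced here: $\rho$ is the tumor purity, $n_T$ the total copy number of the locus in tumor cells, $n_N$ the copy number in normal cells (2 on autosomes), $m$ the number of copies carrying the mutation, and $c$ the fraction of cancer cells carrying it.

$$
f = \frac{\rho \, m \, c}{\rho \, n_T + (1 - \rho) \, n_N}
$$

For example, a clonal mutation ($c = 1$) on one copy ($m = 1$) of a three-copy segment ($n_T = 3$) in a sample with $\rho = 0.8$ gives $f = 0.8 / (0.8 \cdot 3 + 0.2 \cdot 2) = 0.8 / 2.8 \approx 0.29$. Solving the same relation for $c$ shows that any error in $\rho$ or $n_T$ feeds directly into the estimated cancer cell fraction, and therefore into the clusters.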

A new fast method for copy number calling, tissue purity estimating and subclone inferring in cancer genome

Our new method has finally been published in Nature Protocols. We developed a series of methods and an accompanying combined C++/R software package, Sclust (around 1.5 GB; it is a large file, so download with care). With Sclust, you can perform copy-number calling, tumor purity estimation, and inference of clonal and subclonal structure from paired normal-tumor whole-genome or exome sequencing data.

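To summarize, our method has the following key points:

1. It performs copy-number calling, tumor purity estimation, and subclonal inference accurately.

2. Subclonal inference is extremely fast: clonal inference on 4,000 to 6,000 SNVs takes only 3 to 5 seconds on a personal computer.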

3. Sclust reports the copy-number changes for each cluster; at present, only a few tools offer this feature.

You are welcome to use the software, ask questions, and get in touch.

Contact email: yp.cun@outlook.com. Below is some background on clonal inference.

Continue reading “A new fast method for copy number calling, tissue purity estimating and subclone inferring in cancer genome”

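Applications of R in Biomedical Data Processing, Session 1

Applications of R in Biomedical Data Processing

Session 1: November 4–5, 2017

Kunming

Course content: an in-depth introduction to the basic ideas of programming and the philosophy of the R language, with plenty of hands-on R programming and data-processing practice plus Q&A. Participants work through several examples of analysis in R, including basic statistics, analysis of GEO microarray data, and downloading and analyzing TCGA data.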

 

A useful course of biomedical data analysis

Biomedical Data Science: http://genomicsclass.github.io/book/

Chapter 0 – Introduction

Chapter 1 – Inference

Chapter 2 – Exploratory Data Analysis

Chapter 3 – Robust Statistics

Chapter 4 – Matrix Algebra

Chapter 5 – Linear Models

Chapter 6 – Inference for High-Dimensional Data

Chapter 7 – Statistical Modeling

Chapter 8 – Distance and Dimension Reduction

Chapter 9 – Practical Machine Learning

Chapter 10 – Batch Effects


525.5x: Introduction to Bioconductor: Annotation and analysis

Setup and basics on biological background (Week 1)

Focus on data structure and management (Week 2)

Focus on genomic ranges (Week 3a)

Focus on genomic annotation (Week 3b)

Testing genome-scale hypotheses (Week 4)

525.6x: High-performance computing for reproducible genomics with Bioconductor

Visualization of genome scale data (Week 1)

Scalable genomic analysis (Week 2)

Multi-omic data integration (Week 3)

Fostering reproducible genome-scale analysis (Week 4)


Legacy material from 2015 Introduction to Bioconductor

RNA-seq data analysis

Variant Discovery and Genotyping

ChIP-seq data analysis

DNA methylation data analysis


Footnotes for all lectures

Acknowledgments

Continue reading “A useful course of biomedical data analysis”

【c】Frontiers in Single Cell Genomics, Suzhou

Frontiers in Single Cell Genomics

http://www.csh-asia.org/2016meetings/cell.html

 

We are pleased to announce the Cold Spring Harbor Asia conference on Frontiers in Single Cell Genomics which will be held in Suzhou, China, located approximately 60 miles west of Shanghai. The conference will begin at 7:00pm on the evening of Monday November 7, and will conclude after lunch on November 11, 2016.

Continue reading “【c】Frontiers in Single Cell Genomics, Suzhou”

A brief introduction to “apply” in R

A good, practical guideline for “apply” in R.

What You're Doing Is Rather Desperate

At any R Q&A site, you’ll frequently see an exchange like this one:

Q: How can I use a loop to […insert task here…] ?
A: Don’t. Use one of the apply functions.

So, what are these wondrous apply functions and how do they work? I think the best way to figure out anything in R is to learn by experimentation, using embarrassingly trivial data and functions.

If you fire up your R console, type “??apply” and scroll down to the functions in the base package, you’ll see something like this:

Let’s examine each of those.

1. apply
Description: “Returns a vector or array or list of values obtained by applying a function to margins of an array or matrix.”

OK – we know about vectors/arrays and functions, but what are these “margins”? Simple: either the rows (1), the columns (2) or both (1:2). By “both”, we mean “apply the…
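In the same spirit of experimenting with trivial data, here is a minimal sketch of the three margin choices on a tiny matrix. It is not taken from the original post; the matrix `m` and the variable names are made up for illustration, and only base R is used.

```r
# A tiny 2 x 3 matrix; the rows are 1,2,3 and 4,5,6
m <- matrix(1:6, nrow = 2, byrow = TRUE)

# MARGIN = 1: the function is called once per row
row_sums <- apply(m, 1, sum)

# MARGIN = 2: the function is called once per column
col_sums <- apply(m, 2, sum)

# MARGIN = 1:2: the function is called once per cell,
# so the result keeps the shape of the input matrix
doubled <- apply(m, 1:2, function(x) x * 2)

print(row_sums)
print(col_sums)
print(doubled)
```

With a function that returns a single value, margin 1 yields one value per row and margin 2 one value per column, while `1:2` yields a matrix of the same dimensions as `m`.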

View original post 1,003 more words