COSINE User Guide 0.1

1. Overview

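Subclonal reconstruction from tumor DNA sequencing data has become an important part of tumor evolution research, offering new ways to study the relative ordering of variants and mutational processes. In earlier work, most studies reproduced previously published results and then ran new data sets on top of them to check the generalizability, precision, and sensitivity of an analysis tool. In recent years, however, the growing number of tumor evolution analysis tools has made the choice harder for clinicians and researchers. How to pick the one or few most effective tools, and how to judge their strengths and weaknesses, has become a pressing question. To evaluate this growing set of tools and to give practitioners reliable, clear analysis information, we extended earlier work by integrating 12 tumor evolution analysis tools into a single platform and providing a way to generate the input data that each of the 12 tools requires. Starting from raw data, i.e. fastq files, users can generate the input files needed by all 12 tools. The aim is to ensure that tool evaluations share the same data source and the same evaluation criteria. In practice, users can consult the outputs of all 12 tools to analyze the same sequencing sample more comprehensively.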


COSINE web server launched at www.clab-cosine.net

COSINE: A Web Server for Clonal and Subclonal Structure Inferencing and Evolution in Cancer Genomics. COSINE is freely accessible at http://www.clab-cosine.net or http://bio.rj.run:48996/cun-web/.  

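COSINE provides a web interface to 12 subclonal inference algorithms, so users can log in to the site and run the analyses directly. Testing and feedback on your experience are welcome. The 12 methods in COSINE are: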

Sclust paper published in Nature Protocols

After years of effort, our Sclust paper has finally been published in Nature Protocols. Enjoy!

Yupeng Cun, Tsun-Po Yang, Viktor Achter*, Ulrich Lang, Martin Peifer. Copy number analysis and inference of subclonal populations in cancer genomes using Sclust. Nature Protocols, 2018, DOI: 10.1038/nprot.2018.033
Sclust download link: rj.run/downloads/Sclust.tgz


Frequently asked questions on using the Sclust software package:

A new fast method for copy number calling, tissue purity estimating and subclone inferring in cancer genome

Our new methods have finally been published in Nature Protocols. We developed a series of methods and a combined C++/R software package, Sclust (around 1.5 GB; it is a large file, so download with care). With Sclust, you can perform copy number calling, tumor purity estimation, and clonal and subclonal structure inference from matched normal-tumor whole-genome or whole-exome sequencing data.

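To summarize first, our method has the following key points:

1. It performs copy number calling, tumor purity estimation, and subclonal inference accurately;

2. Subclonal inference is very fast: clonal inference on 4000~6000 SNVs takes only 3 to 5 seconds on a personal computer.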

3. Sclust reports the copy-number multiplicity changes for each cluster; only a few tools currently offer this.

You are welcome to use the software, ask questions, and get in touch.

Contact email: yp.cun@outlook.com. Below is some background on clonal inference.


A new R package for network-based biomarker discovery released

A new R package, netClass, has been released. netClass integrates network information, such as a protein-protein interaction network or KEGG, into mRNA classification, and can also combine miRNA with mRNA data through a miRNA-mRNA interaction network for biomarker discovery. We call this method stSVM; it has been published in PLoS ONE (Cun et al. 2013). netClass implements the following methods, including stSVM:

  1. AEP (average gene expression of pathway), Guo et al., BMC Bioinformatics 2005, 6:58.
  2. PAC (pathway activity classification), Lee E, et al., PLoS Comput Biol 4(11): e1000217.
  3. hubc (hub nodes classification), Taylor et al. (2009), Nat. Biotech., doi: 10.1038/nbt.152
  4. frSVM (filter via top ranked genes), Cun et al., arXiv:1212.3214; Winter et al., PLoS Comput Biol 8(5): e1002511.
  5. stSVM (network smoothed t-statistic), Cun et al., PLoS ONE, 2013.
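
As a rough illustration of the simplest idea in this list, an AEP-style feature (the average expression of the genes in a pathway) can be sketched in a few lines of Python. This is not netClass code and does not reflect the package's API: the function name `pathway_average`, the toy expression values, and the pathway memberships are all hypothetical, and skipping unmeasured genes is an assumption of this sketch.

```python
from statistics import mean


def pathway_average(expression, pathways):
    """Collapse per-gene expression into one averaged value per pathway.

    expression: dict mapping gene name -> list of values, one per sample
    pathways:   dict mapping pathway name -> list of member gene names
    Returns a dict mapping pathway name -> list of per-sample averages.
    Genes without expression data are skipped (an assumption of this sketch).
    """
    features = {}
    for name, genes in pathways.items():
        present = [g for g in genes if g in expression]
        if not present:
            continue  # no measured member genes, so no feature for this pathway
        n_samples = len(expression[present[0]])
        features[name] = [
            mean(expression[g][i] for g in present) for i in range(n_samples)
        ]
    return features


# Toy data (hypothetical): three genes measured in two samples.
expression = {"G1": [1.0, 3.0], "G2": [3.0, 5.0], "G3": [10.0, 20.0]}
pathways = {"P_a": ["G1", "G2"], "P_b": ["G3", "G_missing"]}
features = pathway_average(expression, pathways)
print(features)
```

The resulting pathway-level values could then be passed to any classifier in place of per-gene features.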

netClass can be downloaded from SourceForge (http://sourceforge.net/projects/netclassr/) or CRAN (http://cran.r-project.org/web/packages/netClass/). For more details on netClass, see these four papers:

“Translational Bioinformatics” collection for PLOS cBio

A collection of reviews on current approaches in translational bioinformatics.

‘Translational Bioinformatics’ is a collection of PLOS Computational Biology Education articles which reads as a “book” to be used as a reference or tutorial for a graduate level introductory course on the science of translational bioinformatics.

Translational bioinformatics is an emerging field that addresses the current challenges of integrating increasingly voluminous amounts of molecular and clinical data. Its aim is to provide a better understanding of the molecular basis of disease, which in turn will inform clinical practice and ultimately improve human health.

The concept of a translational bioinformatics introductory book was originally conceived in 2009 by Jake Chen and Maricel Kann. Each chapter was crafted by leading experts who provide a solid introduction to the topics covered, complete with training exercises and answers. The rapid evolution of this field is expected to lead to updates and new chapters that will be incorporated into this collection.

Collection editors: Maricel Kann, Guest Editor, and Fran Lewitter, PLOS Computational Biology Education Editor.

Download the full Translational Bioinformatics collection here: PDF

Collection URL: www.ploscollections.org/translationalbioinformatics

PLOS Computational Biology: published 27 Dec 2012 | info:doi/10.1371/journal.pcbi.1002796

PLOS Computational Biology: published 27 Dec 2012 | info:doi/10.1371/journal.pcbi.1002826

Chapter 2: Data-Driven View of Disease Biology

Casey S. Greene, Olga G. Troyanskaya

PLOS Computational Biology: published 27 Dec 2012 | info:doi/10.1371/journal.pcbi.1002816

PLOS Computational Biology: published 27 Dec 2012 | info:doi/10.1371/journal.pcbi.1002805

Chapter 4: Protein Interactions and Disease

Mileidy W. Gonzalez, Maricel G. Kann

PLOS Computational Biology: published 27 Dec 2012 | info:doi/10.1371/journal.pcbi.1002819

Chapter 5: Network Biology Approach to Complex Diseases

Dong-Yeon Cho, Yoo-Ah Kim, Teresa M. Przytycka

PLOS Computational Biology: published 27 Dec 2012 | info:doi/10.1371/journal.pcbi.1002820

PLOS Computational Biology: published 27 Dec 2012 | info:doi/10.1371/journal.pcbi.1002821

Chapter 7: Pharmacogenomics

Konrad J. Karczewski, Roxana Daneshjou, Russ B. Altman

PLOS Computational Biology: published 27 Dec 2012 | info:doi/10.1371/journal.pcbi.1002817

PLOS Computational Biology: published 27 Dec 2012 | info:doi/10.1371/journal.pcbi.1002858

Chapter 9: Analyses Using Disease Ontologies

Nigam H. Shah, Tyler Cole, Mark A. Musen

PLOS Computational Biology: published 27 Dec 2012 | info:doi/10.1371/journal.pcbi.1002827

Chapter 10: Mining Genome-Wide Genetic Markers

Xiang Zhang, Shunping Huang, Zhaojun Zhang, Wei Wang

PLOS Computational Biology: published 27 Dec 2012 | info:doi/10.1371/journal.pcbi.1002828

Chapter 11: Genome-Wide Association Studies

William S. Bush, Jason H. Moore

PLOS Computational Biology: published 27 Dec 2012 | info:doi/10.1371/journal.pcbi.1002822

Chapter 12: Human Microbiome Analysis

Xochitl C. Morgan, Curtis Huttenhower

PLOS Computational Biology: published 27 Dec 2012 | info:doi/10.1371/journal.pcbi.1002808

PLOS Computational Biology: published 27 Dec 2012 | info:doi/10.1371/journal.pcbi.1002823

Chapter 14: Cancer Genome Analysis

Miguel Vazquez, Victor de la Torre, Alfonso Valencia

PLOS Computational Biology: published 27 Dec 2012 | info:doi/10.1371/journal.pcbi.1002824

 

Use prior information for prognostic biomarkers or not?

In our recent publication in BMC Bioinformatics, we compared a large number of feature selection methods for finding prognostic biomarkers in 6 breast cancer gene expression data sets. No method showed a significant advantage in prediction accuracy, feature selection stability, or biological interpretability. This runs against previous research results: current network-based approaches did not show much benefit in our analysis. Meanwhile, a group from NKI reported similar results in PLoS ONE. The R code for the algorithms in our paper is available on request.

Prediction performance in terms of area under ROC curve (AUC)


Current approaches to finding biomarkers by means of machine learning

Finding robust biomarkers in genomics data is a first step toward personalized medicine. Here we briefly review how machine learning is used to find biomarkers and the current approaches in this area. For more on the techniques involved, please see the following papers.

Biomarker Gene Signature Discovery Integrating Network Knowledge

Bonn-Aachen International Center for IT (B-IT), Dahlmannstr. 2, 53113 Bonn, Germany
Abstract: Discovery of prognostic and diagnostic biomarker gene signatures for diseases, such as cancer, is seen as a major step towards a better personalized medicine. During the last decade various methods, mainly coming from the machine learning or statistical domain, have been proposed for that purpose. However, one important obstacle for making gene signatures a standard tool in clinical diagnosis is the typical low reproducibility of these signatures combined with the difficulty to achieve a clear biological interpretation. For that purpose in the last years there has been a growing interest in approaches that try to integrate information from molecular interaction networks. Here we review the current state of research in this field by giving an overview about so-far proposed approaches.

Social network, machine learning and disease-genes

Some recent papers on how disease gene networks work and on cancer metastasis. Machine learning is a good tool for studying the relation between individual genes and disease. Here are the papers:

Infectious Disease Modeling of Social Contagion in Networks

Alison L. Hill, David G. Rand, Martin A. Nowak, Nicholas A. Christakis

Information, trends, behaviors and even health states may spread between contacts in a social network, similar to disease transmission. However, a major difference is that as well as being spread infectiously, it is possible to acquire this state spontaneously. For example, you can gain knowledge of a particular piece of information either by being told about it, or by discovering it yourself. In this paper we introduce a mathematical modeling framework that allows us to compare the dynamics of these social contagions to traditional infectious diseases. We can also extract and compare the rates of spontaneous versus contagious acquisition of a behavior from longitudinal data and can use this to predict the implications for future prevalence and control strategies. As an example, we study the spread of obesity, and find that the current rate of becoming obese is about 2% per year and increases by 0.5 percentage points for each obese social contact, while the rate of recovering from obesity is 4% per year. The rates of spontaneous infection and transmission have steadily increased over time since 1970, driving the increase in obesity prevalence. Our model thus provides a quantitative way to analyze the strength and implications of social contagions.
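
To make the rates in this abstract concrete, here is a minimal Python sketch of a well-mixed (mean-field) version of such a model, in which people become obese either spontaneously or through obese contacts and recover at a constant rate. It is a simplification under stated assumptions, not the authors' network model: the function names, the number of contacts per person `k`, and the starting prevalence are hypothetical, and the quoted rates are read as percentages per year (0.02 and 0.04, plus 0.005 per obese contact).

```python
import math


def simulate_prevalence(a, beta, g, k, i0, years, dt=0.01):
    """Euler-integrate dI/dt = (a + beta*k*I)*(1 - I) - g*I.

    I is the obese fraction; a is the spontaneous rate, beta the extra
    rate per obese contact, and g the recovery rate (all per year).
    k is the number of contacts per person (hypothetical, well-mixed).
    """
    i = i0
    for _ in range(int(round(years / dt))):
        i += dt * ((a + beta * k * i) * (1.0 - i) - g * i)
    return i


def equilibrium_prevalence(a, beta, g, k):
    """Positive root of beta*k*I^2 + (a + g - beta*k)*I - a = 0."""
    bk = beta * k
    if bk == 0:
        return a / (a + g)  # no contagion: balance of gain and recovery
    b = a + g - bk
    return (-b + math.sqrt(b * b + 4.0 * bk * a)) / (2.0 * bk)


# Rates read from the abstract; k and the start value are assumptions.
A, BETA, G, K = 0.02, 0.005, 0.04, 4
steady = equilibrium_prevalence(A, BETA, G, K)
simulated = simulate_prevalence(A, BETA, G, K, i0=0.1, years=500)
print(f"equilibrium: {steady:.4f}, simulated after 500 years: {simulated:.4f}")
```

Setting `beta` to zero removes the contagious term, which gives a quick way to compare spontaneous acquisition alone against spontaneous plus contagious acquisition in this simplified setting.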
