DocumentCode :
79847
Title :
Integrative Data Analysis of Multi-Platform Cancer Data with a Multimodal Deep Learning Approach
Author :
Muxuan Liang ; Zhizhong Li ; Ting Chen ; Jianyang Zeng
Author_Institution :
Dept. of Math. Sci., Tsinghua Univ., Beijing, China
Volume :
12
Issue :
4
fYear :
2015
fDate :
July-Aug. 1 2015
Firstpage :
928
Lastpage :
937
Abstract :
Identification of cancer subtypes plays an important role in revealing useful insights into disease pathogenesis and advancing personalized therapy. The recent development of high-throughput sequencing technologies has enabled the rapid collection of multi-platform genomic data (e.g., gene expression, miRNA expression, and DNA methylation) for the same set of tumor samples. Although numerous integrative clustering approaches have been developed to analyze cancer data, few of them are particularly designed to exploit both deep intrinsic statistical properties of each input modality and complex cross-modality correlations among multi-platform input data. In this paper, we propose a new machine learning model, called multimodal deep belief network (DBN), to cluster cancer patients from multi-platform observation data. In our integrative clustering framework, relationships among inherent features of each single modality are first encoded into multiple layers of hidden variables, and then a joint latent model is employed to fuse common features derived from multiple input modalities. A practical learning algorithm, called contrastive divergence (CD), is applied to infer the parameters of our multimodal DBN model in an unsupervised manner. Tests on two available cancer datasets show that our integrative data analysis approach can effectively extract a unified representation of latent features to capture both intra- and cross-modality correlations, and identify meaningful disease subtypes from multi-platform cancer data. In addition, our approach can identify key genes and miRNAs that may play distinct roles in the pathogenesis of different cancer subtypes. Among those key miRNAs, we found that the expression level of miR-29a is highly correlated with survival time in ovarian cancer patients. 
These results indicate that our multimodal DBN based data analysis approach may have practical applications in cancer pathogenesis studies and provide useful guidelines for personalized cancer therapy.
Keywords :
RNA; belief networks; cancer; data analysis; feature extraction; genetics; genomics; medical computing; molecular biophysics; pattern clustering; tumours; unsupervised learning; DNA methylation; advancing personalized therapy; cancer data analysis; cancer pathogenesis; cancer patient clustering; cancer subtype identification; complex cross-modality correlations; contrastive divergence; cross-modality correlations; disease pathogenesis; gene expression; high-throughput sequencing technologies; input modality; integrative clustering approaches; integrative data analysis; integrative data analysis approach; intramodality correlations; intrinsic statistical properties; joint latent model; key genes; latent feature extraction; machine learning model; miR-29a; miRNA expression; multimodal DBN based data analysis; multimodal DBN model; multimodal deep belief network; multimodal deep learning approach; multiplatform cancer data; multiplatform genomic data; multiple input modalities; ovarian cancer patients; personalized cancer therapy; practical learning algorithm; tumor samples; unsupervised manner; Bioinformatics; Cancer; Computational biology; DNA; Data analysis; Data models; Genomics; Multi-platform cancer data analysis; clinical data; genomic data; identification of cancer subtypes; multimodal deep belief network; restricted Boltzmann machine;
fLanguage :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2014.2377729
Filename :
6977954