DocumentCode
3601353
Title
A Resampling Based Clustering Algorithm for Replicated Gene Expression Data
Author
Han Li ; Chun Li ; Xiaodan Fan
Author_Institution
Dept. of Stat., Shenzhen Univ., Shenzhen, China
Volume
12
Issue
6
fYear
2015
Firstpage
1295
Lastpage
1303
Abstract
In gene expression data analysis, clustering is a fruitful exploratory technique for revealing underlying molecular mechanisms by identifying groups of co-expressed genes. To reduce noise, multiple experimental replicates are usually performed. An integrative analysis of the full replicate data, instead of reducing the data to the mean profile, promises more precise and robust clusters. In this paper, we propose a novel resampling-based clustering algorithm for genes with replicated expression measurements. Assuming the replicates are exchangeable, we formulate the problem in the bootstrap framework and aim to infer the consensus clustering from bootstrap samples of the replicates. In our approach, we adopt a mixed effect model to accommodate heterogeneous variances and implement a quasi-MCMC algorithm to conduct statistical inference. Experiments demonstrate that, by taking advantage of the full replicate data, our algorithm produces more reliable clusters and performs robustly in diverse scenarios, especially when the data are subject to multiple sources of variance.
Keywords
bootstrapping; genetics; inference mechanisms; statistical analysis; bootstrap samples; coexpressed genes; integrative analysis; molecular mechanism; noise reduction; quasiMCMC algorithm; replicated gene expression data; resampling based clustering algorithm; statistical inference; Algorithm design and analysis; Bioinformatics; Clustering algorithms; Computational biology; Data models; Genomics; Gene clustering; mixed effect model; replicated microarray data
fLanguage
English
Journal_Title
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher
IEEE
ISSN
1545-5963
Type
jour
DOI
10.1109/TCBB.2015.2403320
Filename
7042338