DocumentCode :
1056691
Title :
Error-Pooling Empirical Bayes Model for Enhanced Statistical Discovery of Differential Expression in Microarray Data
Author :
Cho, HyungJun ; Lee, Jae K.
Author_Institution :
Korea Univ., Seoul
Volume :
38
Issue :
2
fYear :
2008
fDate :
3/1/2008
Firstpage :
425
Lastpage :
436
Abstract :
A number of statistical approaches have been proposed for evaluating the statistical significance of differential expression in microarray data. Error estimation in these approaches is inaccurate when the number of replicated arrays is small. Consequently, the resulting statistics are often underpowered to detect important differential expression patterns in microarray data with limited replication. In this paper, we propose an empirical Bayes (EB) heterogeneous error model (HEM) with error-pooling prior specifications for varying technical and biological errors in microarray data. The error estimation of HEM is thus strengthened by, and shrunk toward, the EB priors that are obtained by error-pooling estimation at each local intensity range. Using simulated and real data sets, we compared HEM with two widely used statistical approaches, significance analysis of microarrays (SAM) and analysis of variance (ANOVA), for identifying differential expression patterns across multiple conditions. The comparison showed that HEM is statistically more powerful than SAM and ANOVA, particularly when the sample size is smaller than five. We also suggest a resampling-based estimation of the Bayesian false discovery rate to provide a biologically relevant cutoff criterion for HEM statistics.
Keywords :
Bayes methods; Markov processes; Monte Carlo methods; biology computing; data analysis; error statistics; genetics; pattern recognition; Bayesian false discovery; Markov chain; Monte Carlo method; biological errors; differential expression patterns; empirical Bayes heterogeneous error model; error estimation; error pooling estimation; gene expression data; heteroscedastic error; microarray data; replicated arrays; significance analysis; statistical discovery; technical errors; variance analysis; Bayesian false discovery rate (FDR); Markov chain Monte Carlo (MCMC); empirical Bayes (EB)
fLanguage :
English
Journal_Title :
Systems, Man and Cybernetics, Part A: Systems and Humans, IEEE Transactions on
Publisher :
IEEE
ISSN :
1083-4427
Type :
jour
DOI :
10.1109/TSMCA.2007.914761
Filename :
4445760