Title :
Clustering gene expression data using probabilistic non-negative matrix factorization
Author :
Bayar, Belhassen ; Bouaynaya, Nidhal ; Shterenberg, Roman
Author_Institution :
Dept. of Electr. Eng., Ecole Nat. d'Ing. de Tunis, Tunis, Tunisia
Abstract :
Non-negative matrix factorization (NMF) has proven to be a useful decomposition for multivariate data. Specifically, NMF appears to have advantages over other clustering methods, such as hierarchical clustering, for identification of distinct molecular patterns in gene expression profiles. The NMF algorithm, however, is deterministic. In particular, it does not take into account the noisy nature of the measured genomic signals. In this paper, we extend the NMF algorithm to the probabilistic case, where the data is viewed as a stochastic process. We show that the probabilistic NMF can be viewed as a weighted regularized matrix factorization problem, and derive the corresponding update rules. Our simulation results show that the probabilistic non-negative matrix factorization (PNMF) algorithm is more accurate and more robust than its deterministic homologue in clustering cancer subtypes in a leukemia microarray dataset.
Keywords :
biology computing; cancer; genetics; matrix decomposition; pattern clustering; probability; stochastic processes; NMF algorithm; cancer subtype clustering; deterministic homologue; distinct molecular pattern identification; gene expression data clustering; gene expression profiles; hierarchical clustering; leukemia microarray dataset; multivariate data; probabilistic nonnegative matrix factorization algorithm; stochastic process; update rules; weighted regularized matrix factorization problem; Bioinformatics; Clustering algorithms; Gene expression; Probabilistic logic; Robustness; Signal to noise ratio;
Conference_Title :
Genomic Signal Processing and Statistics (GENSIPS), 2011 IEEE International Workshop on
Conference_Location :
San Antonio, TX
Print_ISBN :
978-1-4673-0491-7
Electronic_ISBN :
2150-3001
DOI :
10.1109/GENSiPS.2011.6169465