Title :
Optimal Bayesian feature selection on high dimensional gene expression data
Author :
Pour, Ali Foroughi ; Dalton, Lori A.
Author_Institution :
Dept. of Electr. & Comput. Eng., Ohio State Univ., Columbus, OH, USA
Abstract :
Recent work proposes a Bayesian hierarchical model for feature selection in which priors are placed over the identity of each feature, as well as over the underlying feature-label distribution. Given data, Bayesian inference can be used to find a maximum posterior probability feature set. In this work, we examine the application of this theory to microarray data for biomarker discovery. A major challenge is adapting the theory to very high-dimensional spaces. We therefore propose two suboptimal feature selection algorithms, based on optimal Bayesian feature selection theory, that perform very well with relatively low computational burden, making them well suited to molecular biomarker discovery. We demonstrate in a synthetic microarray model that the proposed methods are quite robust to deviations from modeling assumptions, and in fact achieve outstanding performance relative to popular methods.
Keywords :
belief networks; bioinformatics; feature selection; genetics; inference mechanisms; Bayesian hierarchical model; Bayesian inference; feature identification; feature-label distribution; high-dimensional gene expression data; high-dimensional spaces; maximum posterior probability feature set; microarray data; molecular biomarker discovery; optimal Bayesian feature selection theory; suboptimal feature selection algorithms; synthetic microarray model; Bayes methods; Bioinformatics; Biological system modeling; Computational modeling; Data models; Genomics; Signal processing algorithms; Bayesian modeling; Feature selection; biomarker discovery;
Conference_Title :
Signal and Information Processing (GlobalSIP), 2014 IEEE Global Conference on
Conference_Location :
Atlanta, GA
DOI :
10.1109/GlobalSIP.2014.7032358