Title :
A common factor-analytic model for classification
Author :
Sun, Mingzhu; McLachlan, Geoffrey J.
Author_Institution :
Dept. of Math., Univ. of Queensland, Brisbane, QLD, Australia
Abstract :
In this era of data explosion, much research has been directed to the problem of filtering and extracting useful information from extremely large datasets. We focus on discriminant analysis of high-dimensional data, where the number of dimensions p is very large relative to the number of observations n. Mixture discriminant analysis provides an effective parametric approach in which each class density is modeled by a mixture of common factor analyzers. Although adopting mixture models with common factor loadings in the components significantly reduces the number of parameters to be estimated, the number of variables must first be reduced to a more manageable level. We therefore consider the problem of dimension reduction for high-dimensional data and propose a factor-analytic model with common factor loadings for classification. We apply the model to a breast cancer study involving microarray gene expression data; the results indicate that this parametric approach can select informative genes that improve the prediction of disease outcome.
Keywords :
cancer; data analysis; genetics; medical information systems; pattern classification; breast cancer; common factor-analytic model; disease; high-dimensional data dimension reduction; high-dimensional data discriminant analysis; informative gene selection; microarray gene expression data; mixture discriminant analysis; mixture model; parameter estimation; Analytical models; Breast cancer; Computational modeling; Covariance matrices; Error analysis; Load modeling; Loading
Conference_Title :
2013 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Conference_Location :
Shanghai
DOI :
10.1109/BIBM.2013.6732722