Title :
The Fisher-Markov Selector: Fast Selecting Maximally Separable Feature Subset for Multiclass Classification with Applications to High-Dimensional Data
Author :
Cheng, Qiang ; Zhou, Hongbo ; Cheng, Jie
Author_Institution :
Dept. of Comput. Sci., Southern Illinois Univ. Carbondale, Carbondale, IL, USA
fDate :
6/1/2011 12:00:00 AM
Abstract :
Selecting features for multiclass classification is a critically important task for pattern recognition and machine learning applications. Especially challenging is selecting an optimal subset of features from high-dimensional data, which typically have many more variables than observations and contain significant noise, missing components, or outliers. Existing methods either cannot handle high-dimensional data efficiently or scalably, or can only obtain local optimum instead of global optimum. Toward the selection of the globally optimal subset of features efficiently, we introduce a new selector - which we call the Fisher-Markov selector - to identify those features that are the most useful in describing essential differences among the possible groups. In particular, in this paper we present a way to represent essential discriminating characteristics together with the sparsity as an optimization objective. With properly identified measures for the sparseness and discriminativeness in possibly high-dimensional settings, we take a systematic approach for optimizing the measures to choose the best feature subset. We use Markov random field optimization techniques to solve the formulated objective functions for simultaneous feature selection. Our results are noncombinatorial, and they can achieve the exact global optimum of the objective function for some special kernels. The method is fast; in particular, it can be linear in the number of features and quadratic in the number of observations. We apply our procedure to a variety of real-world data, including mid--dimensional optical handwritten digit data set and high-dimensional microarray gene expression data sets. The effectiveness of our method is confirmed by experimental results. In pattern recognition and from a model selection viewpoint, our procedure says that it is possible to select the most discriminating subset of variables by solving a very simple unconstrained objective function which in fact can be ob- ained with an explicit expression.
Keywords :
Markov processes; learning (artificial intelligence); optimisation; pattern classification; Fisher-Markov selector; Markov random field optimization techniques; fast selecting maximally separable feature subset; high dimensional data; high dimensional microarray gene expression data sets; machine learning applications; middimensional optical handwritten digit data set; multiclass classification; pattern recognition; Complexity theory; Feature extraction; Kernel; Markov processes; Noise; Optimization; Pattern recognition; Classification; Fisher´s linear discriminant analysis; Markov random field.; feature subset selection; high-dimensional data; kernel; Algorithms; Artificial Intelligence; Discriminant Analysis; Female; Humans; Image Interpretation, Computer-Assisted; Imaging, Three-Dimensional; Leukemia; Lung Neoplasms; Male; Markov Chains; Multivariate Analysis; Pattern Recognition, Automated; Prostatic Neoplasms; Software;
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on
DOI :
10.1109/TPAMI.2010.195