Title :
A modeling approach to feature selection
Author :
Sheinvald, Jacob; Dom, B.; Niblack, Wayne
Author_Institution :
IBM, San Jose, CA, USA
Abstract :
An information-theoretic approach is used to derive a new feature-selection criterion capable of detecting totally useless features. Since the number of useless features is initially unknown, traditional class-separability and distance measures cannot cope with this problem. The useless feature subset is detected by fitting a probability model to a given training set of classified feature vectors, using the minimum-description-length criterion (MDLC) for model selection. The resulting criterion for the Gaussian case is a simple closed-form expression with a plausible geometric interpretation, and it is proved to be consistent, i.e., it yields the true useless subset with probability 1 as the size of the training set grows to infinity. Simulations show excellent results compared to the cross-validation method and other information-theoretic criteria, even for small training sets.
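The MDL idea described in the abstract can be illustrated with a small sketch: score each candidate useful-feature subset by the code length of the data under a fitted Gaussian model plus a (k/2) log n penalty for the k free parameters, and keep the subset minimizing the total. The code below is a hypothetical illustration, not the paper's closed-form criterion: it assumes independent (diagonal-covariance) Gaussian features, models a "useful" feature with one Gaussian per class and a "useless" feature with a single class-independent Gaussian, and searches subsets exhaustively; function names such as description_length and select_features are invented for the example.

```python
import itertools
import numpy as np

def gaussian_nll(x):
    """Negative log-likelihood of samples x under a 1-D Gaussian with ML estimates."""
    n = len(x)
    var = np.var(x) + 1e-12  # guard against zero variance
    return 0.5 * n * (np.log(2 * np.pi * var) + 1.0)

def description_length(X, y, useful):
    """MDL-style score: data code length plus (k/2) log n parameter cost."""
    n, d = X.shape
    classes = np.unique(y)
    nll, k = 0.0, 0
    for j in range(d):
        if j in useful:
            # useful feature: a separate Gaussian per class
            for c in classes:
                nll += gaussian_nll(X[y == c, j])
            k += 2 * len(classes)  # mean + variance per class
        else:
            # useless feature: one class-independent Gaussian
            nll += gaussian_nll(X[:, j])
            k += 2  # mean + variance
    return nll + 0.5 * k * np.log(n)

def select_features(X, y):
    """Exhaustive search over feature subsets; return the MDL-minimizing useful set."""
    d = X.shape[1]
    best = min((description_length(X, y, set(s)), s)
               for r in range(d + 1)
               for s in itertools.combinations(range(d), r))
    return set(best[1])

# Toy data: feature 0 separates the classes, feature 1 is pure noise.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0], 1, (200, 2)),
               rng.normal([3, 0], 1, (200, 2))])
y = np.repeat([0, 1], 200)
print(select_features(X, y))  # expected: {0}
```

On the toy data the class-dependent feature 0 is retained, while the noise feature 1 is priced out by the per-class parameter cost, mirroring the consistency property claimed in the abstract.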
Keywords :
information theory; pattern recognition; picture processing; probability; Gaussian case; feature selection; geometric interpretation; minimum-description-length criterion; modeling; probability model; useless feature subset; Bayesian methods; Closed-form solution; Computer vision; Error analysis; H infinity control; Hardware; Jacobian matrices; Noise measurement; Training data
Conference_Titel :
Proceedings of the 10th International Conference on Pattern Recognition (ICPR 1990)
Conference_Location :
Atlantic City, NJ
Print_ISBN :
0-8186-2062-5
DOI :
10.1109/ICPR.1990.118160