Title :
An input variable importance definition based on empirical data probability and its use in variable selection
Author :
Lemaire, Vincent ; Clérot, Fabrice
Author_Institution :
DTL, France Telecom Res. & Dev., Lannion, France
Abstract :
Variable and feature selection have become the focus of much research in application areas for which datasets with tens or hundreds of thousands of variables are available. We propose a new method to score subsets of variables according to their usefulness for the performance of a given model. This method is applicable to any kind of model and to both classification and regression tasks. We assess the efficiency of the method with our results on the NIPS 2003 feature selection challenge and with an example of a real application.
Keywords :
error statistics; feature extraction; neural nets; pattern classification; probability; regression analysis; set theory; NIPS 2003 feature selection; artificial neural networks; classification task; empirical data probability; regression task; variable subset selection; Filters; Input variables; Performance evaluation; Predictive models; Research and development; Spatial databases; Telecommunications; Warehousing
Conference_Title :
Neural Networks, 2004. Proceedings. 2004 IEEE International Joint Conference on
Print_ISBN :
0-7803-8359-1
DOI :
10.1109/IJCNN.2004.1380149