Title :
Feature selection for unsupervised and supervised inference: the emergence of sparsity in a weighted-based approach
Author :
Wolf, Lior ; Shashua, Amnon
Author_Institution :
Sch. of Eng. & Comput. Sci., Hebrew Univ., Jerusalem, Israel
Abstract :
The problem of selecting a subset of relevant features from a potentially overwhelming quantity of data is classic and arises in many branches of science; examples in computer vision, text processing, and more recently bioinformatics are abundant. We present a definition of "relevancy" based on spectral properties of the affinity matrix (or Laplacian) of the features' measurement matrix. The feature selection process is then based on a continuous ranking of the features defined by a least-squares optimization process. A remarkable property of the feature relevance function is that sparse solutions for the ranking values naturally emerge as a result of a "biased nonnegativity" of a key matrix in the process. As a result, a simple least-squares optimization process converges onto a sparse solution, i.e., a selection of a subset of features that forms a local maximum of the relevance function. The feature selection algorithm can be embedded in both unsupervised and supervised inference problems, and empirical evidence shows that the feature selections typically achieve high accuracy even when only a small fraction of the features are relevant.
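The ranking idea in the abstract can be illustrated with a small alternating eigenvector scheme. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not spell out the objective, so the code assumes a weighted affinity matrix A = sum_i alpha_i m_i m_i^T built from unit-norm feature rows m_i, and alternates between the top-k eigenvectors of A and the leading eigenvector of an auxiliary matrix G. The names `rank_features`, `M`, `k`, `n_iter`, and `tol` are hypothetical.

```python
import numpy as np

def rank_features(M, k, n_iter=50, tol=1e-9):
    """Return unit-norm relevance weights, one per row (feature) of M.

    M has shape (n_features, n_samples); k is the assumed number of
    clusters. The objective and update rules are assumptions made for
    illustration only.
    """
    M = np.asarray(M, dtype=float)
    # Normalize every feature row to unit length.
    M = M / np.linalg.norm(M, axis=1, keepdims=True)
    n = M.shape[0]
    gram = M @ M.T                       # gram[i, j] = m_i . m_j
    alpha = np.full(n, 1.0 / np.sqrt(n))  # start from uniform weights

    for _ in range(n_iter):
        # Weighted affinity over samples: A = sum_i alpha_i * m_i m_i^T.
        A = (M.T * alpha) @ M
        w, V = np.linalg.eigh(A)
        # Keep the k eigenvectors with the largest-magnitude eigenvalues.
        Q = V[:, np.argsort(np.abs(w))[-k:]]

        # G[i, j] = (m_i . m_j) * (m_i^T Q Q^T m_j); symmetric and PSD.
        P = M @ Q
        G = gram * (P @ P.T)
        _, U = np.linalg.eigh(G)
        new_alpha = U[:, -1]             # leading eigenvector, unit norm
        if new_alpha.sum() < 0:          # resolve the sign ambiguity
            new_alpha = -new_alpha

        converged = np.linalg.norm(new_alpha - alpha) < tol
        alpha = new_alpha
        if converged:
            break
    return alpha
```

Features would then be selected by keeping the rows with the largest weights, for example `np.argsort(alpha)[::-1][:r]` for some chosen `r`; how sharply the weights separate relevant from irrelevant features depends on the data.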
Keywords :
feature extraction; learning (artificial intelligence); least mean squares methods; sparse matrices; feature selection algorithm; features measurement matrix; least-squares optimization; local maxima; sparse matrix; supervised inference; unsupervised inference; Bioinformatics; Computer vision; Engines; Face recognition; Sparse matrices; Speech recognition; Support vector machine classification; Support vector machines; Testing; Text processing;
Conference_Title :
Computer Vision, 2003. Proceedings. Ninth IEEE International Conference on
Conference_Location :
Nice, France
Print_ISBN :
0-7695-1950-4
DOI :
10.1109/ICCV.2003.1238369