Author_Institution :
State Key Lab. of Comput. Sci., Inst. of Software, Beijing, China
Abstract :
In this paper, we consider the problem of unsupervised feature selection. Recently, spectral feature selection algorithms, which leverage both the graph Laplacian and spectral regression, have received increasing attention. However, existing spectral feature selection algorithms suffer from two major problems: 1) since the graph Laplacian is constructed from the original feature space, noisy and irrelevant features may have an adverse effect on the estimated graph Laplacian and hence degrade the quality of the induced graph embedding; 2) since the cluster labels are discrete in nature, relaxing and approximating these labels into a continuous embedding inevitably introduces noise into the estimated cluster labels. If the noise in the cluster labels is not taken into account, the feature selection process may be misguided. We therefore propose a Robust Spectral learning framework for unsupervised Feature Selection (RSFS), which jointly improves the robustness of graph embedding and sparse spectral regression. Compared with existing methods, which are sensitive to noisy features, our proposed method utilizes a robust local learning method to construct the graph Laplacian and a robust spectral regression method to handle the noise in the learned cluster labels. We propose an efficient iterative algorithm to solve the resulting optimization problem. We also show the close connection between the proposed robust spectral regression and the robust Huber M-estimator. Experimental results on different datasets show the superiority of RSFS.
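As a rough illustration of the Huber M-estimator mentioned in the abstract, the standard Huber loss can be sketched as below. This is not the RSFS objective, which the abstract does not state; the function name `huber_loss` and the threshold parameter `delta` are illustrative assumptions. The sketch only shows why such a loss is considered robust: small residuals are penalized quadratically, large residuals only linearly, so noisy label estimates pull less on the fit than they would under squared error.

```python
def huber_loss(residual, delta=1.0):
    """Standard Huber loss for a single residual.

    Assumption: `delta` is a hypothetical threshold separating the
    quadratic region (small residuals) from the linear region
    (large residuals). It is not a parameter taken from the paper.
    """
    abs_r = abs(residual)
    if abs_r <= delta:
        # Quadratic region: behaves like ordinary least squares.
        return 0.5 * residual * residual
    # Linear region: growth is damped, limiting the influence of outliers.
    return delta * (abs_r - 0.5 * delta)


def squared_loss(residual):
    """Ordinary squared-error loss, shown only for comparison."""
    return 0.5 * residual * residual
```

For a residual of 3.0 with `delta=1.0`, the Huber loss is 2.5 while the squared loss is 4.5, which shows how a large residual contributes less under the robust loss.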
Keywords :
feature selection; graph theory; iterative methods; pattern clustering; regression analysis; unsupervised learning; RSFS; estimated cluster labels; graph Laplacian; induced graph embedding; iterative algorithm; optimization problem; robust Huber M-estimator; robust local learning method; robust spectral learning framework for unsupervised feature selection; robust spectral regression method; sparse spectral regression; spectral feature selection algorithms; Accuracy; Clustering algorithms; Laplace equations; Mutual information; Noise; Optimization; Robustness;