DocumentCode :
1762600
Title :
A Unified Feature Selection Framework for Graph Embedding on High Dimensional Data
Author :
Chen, Marcus ; Tsang, Ivor W. ; Tan, Mingkui ; Cham, Tat Jen
Author_Institution :
Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore, Singapore
Volume :
27
Issue :
6
fYear :
2015
fDate :
June 1 2015
Firstpage :
1465
Lastpage :
1477
Abstract :
Although graph embedding has been a powerful tool for modeling data intrinsic structures, simply employing all features for data structure discovery may result in noise amplification. This is particularly severe for high dimensional data with small samples. To meet this challenge, this paper proposes a novel efficient framework to perform feature selection for graph embedding, in which a category of graph embedding methods is cast as a least squares regression problem. In this framework, a binary feature selector is introduced to naturally handle the feature cardinality in the least squares formulation. The resultant integral programming problem is then relaxed into a convex Quadratically Constrained Quadratic Program (QCQP) learning problem, which can be efficiently solved via a sequence of accelerated proximal gradient (APG) methods. Since each APG optimization is w.r.t. only a subset of features, the proposed method is fast and memory efficient. The proposed framework is applied to several graph embedding learning problems, including supervised, unsupervised, and semi-supervised graph embedding. Experimental results on several high dimensional data demonstrated that the proposed method outperformed the considered state-of-the-art methods.
Keywords :
convex programming; feature selection; gradient methods; graph theory; learning (artificial intelligence); least squares approximations; quadratic programming; regression analysis; APG methods; APG optimization; accelerated proximal gradient methods; binary feature selector; convex QCQP learning problem; convex quadratically constrained quadratic program learning problem; data structure discovery; graph embedding learning problems; high dimensional data; integral programming problem; least squares regression problem; unified feature selection framework; Data structures; Educational institutions; Gain measurement; Optimization; Principal component analysis; Sparse matrices; Vectors; Sparse graph embedding; efficient feature selection; high dimensional data; sparse principal component analysis;
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2014.2382599
Filename :
6990594