DocumentCode
1762600
Title
A Unified Feature Selection Framework for Graph Embedding on High Dimensional Data
Author
Chen, Marcus ; Tsang, Ivor W. ; Tan, Mingkui ; Cham, Tat Jen
Author_Institution
Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore, Singapore
Volume
27
Issue
6
fYear
2015
fDate
June 1 2015
Firstpage
1465
Lastpage
1477
Abstract
Although graph embedding has been a powerful tool for modeling the intrinsic structure of data, simply employing all features for data structure discovery may result in noise amplification. This is particularly severe for high dimensional data with small samples. To meet this challenge, this paper proposes a novel, efficient framework to perform feature selection for graph embedding, in which a category of graph embedding methods is cast as a least squares regression problem. In this framework, a binary feature selector is introduced to naturally handle the feature cardinality in the least squares formulation. The resultant integral programming problem is then relaxed into a convex Quadratically Constrained Quadratic Program (QCQP) learning problem, which can be efficiently solved via a sequence of accelerated proximal gradient (APG) methods. Since each APG optimization involves only a subset of features, the proposed method is fast and memory efficient. The proposed framework is applied to several graph embedding learning problems, including supervised, unsupervised, and semi-supervised graph embedding. Experimental results on several high dimensional data sets demonstrated that the proposed method outperformed the considered state-of-the-art methods.
Keywords
convex programming; feature selection; gradient methods; graph theory; learning (artificial intelligence); least squares approximations; quadratic programming; regression analysis; APG methods; APG optimization; accelerated proximal gradient methods; binary feature selector; convex QCQP learning problem; convex quadratically constrained quadratic program learning problem; data structure discovery; graph embedding learning problems; high dimensional data; integral programming problem; least squares regression problem; unified feature selection framework; Data structures; Educational institutions; Gain measurement; Optimization; Principal component analysis; Sparse matrices; Vectors; Sparse graph embedding; efficient feature selection; high dimensional data; sparse principal component analysis
fLanguage
English
Journal_Title
Knowledge and Data Engineering, IEEE Transactions on
Publisher
IEEE
ISSN
1041-4347
Type
jour
DOI
10.1109/TKDE.2014.2382599
Filename
6990594