• DocumentCode
    1762600
  • Title

    A Unified Feature Selection Framework for Graph Embedding on High Dimensional Data

  • Author

    Chen, Marcus ; Tsang, Ivor W. ; Tan, Mingkui ; Cham, Tat Jen

  • Author_Institution
    Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore, Singapore
  • Volume
    27
  • Issue
    6
  • fYear
    2015
  • fDate
    June 1 2015
  • Firstpage
    1465
  • Lastpage
    1477
  • Abstract
    Although graph embedding has been a powerful tool for modeling data intrinsic structures, simply employing all features for data structure discovery may result in noise amplification. This is particularly severe for high dimensional data with small samples. To meet this challenge, this paper proposes a novel efficient framework to perform feature selection for graph embedding, in which a category of graph embedding methods is cast as a least squares regression problem. In this framework, a binary feature selector is introduced to naturally handle the feature cardinality in the least squares formulation. The resultant integral programming problem is then relaxed into a convex Quadratically Constrained Quadratic Program (QCQP) learning problem, which can be efficiently solved via a sequence of accelerated proximal gradient (APG) methods. Since each APG optimization is w.r.t. only a subset of features, the proposed method is fast and memory efficient. The proposed framework is applied to several graph embedding learning problems, including supervised, unsupervised, and semi-supervised graph embedding. Experimental results on several high dimensional data sets demonstrated that the proposed method outperformed the considered state-of-the-art methods.
  • Keywords
    convex programming; feature selection; gradient methods; graph theory; learning (artificial intelligence); least squares approximations; quadratic programming; regression analysis; APG methods; APG optimization; accelerated proximal gradient methods; binary feature selector; convex QCQP learning problem; convex quadratically constrained quadratic program learning problem; data structure discovery; graph embedding learning problems; high dimensional data; integral programming problem; least squares regression problem; unified feature selection framework; Data structures; Educational institutions; Gain measurement; Optimization; Principal component analysis; Sparse matrices; Vectors; Sparse graph embedding; efficient feature selection; high dimensional data; sparse principal component analysis
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2014.2382599
  • Filename
    6990594