• DocumentCode
    37995
  • Title

    Clustering-Guided Sparse Structural Learning for Unsupervised Feature Selection

  • Author

    Zechao Li ; Jing Liu ; Yi Yang ; Xiaofang Zhou ; Hanqing Lu

  • Author_Institution
    School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
  • Volume
    26
  • Issue
    9
  • fYear
    2014
  • fDate
    Sept. 2014
  • Firstpage
    2138
  • Lastpage
    2150
  • Abstract
    High-dimensional data represented by a large number of features, many of them redundant or noisy, arise in many pattern analysis and data mining problems. Feature selection is a main technique for dimensionality reduction that identifies a subset of the most useful features. In this paper, a novel unsupervised feature selection algorithm, named clustering-guided sparse structural learning (CGSSL), is proposed and experimentally evaluated; it integrates cluster analysis and sparse structural analysis into a joint framework. Nonnegative spectral clustering is developed to learn more accurate cluster labels of the input samples, and these labels guide feature selection simultaneously. Meanwhile, the cluster labels are also predicted by exploiting the hidden structure shared by different features, which can uncover feature correlations and make the results more reliable. Row-wise sparse models are leveraged to make the proposed model suitable for feature selection. An efficient iterative algorithm is proposed to optimize the formulation. Finally, extensive experiments are conducted on 12 diverse benchmarks, including face data, handwritten digit data, document data, and biomedical data. The theoretical analysis and the encouraging experimental results, compared with several representative algorithms, demonstrate the efficiency and effectiveness of the proposed algorithm for feature selection.
  • Keywords
    feature selection; iterative methods; learning (artificial intelligence); pattern clustering; CGSSL; biomedical data; cluster analysis; clustering-guided sparse structural learning; document data; face data; feature correlations; handwritten digit data; hidden structure; iterative algorithm; nonnegative spectral clustering; row-wise sparse models; sparse structural analysis; unsupervised feature selection; Algorithm design and analysis; Clustering algorithms; Correlation; Integrated circuits; Machine learning algorithms; Optimization; Prediction algorithms; Clustering, classification, and association rules; Computing Methodologies; Database Applications; Database Management; Design Methodology; Feature evaluation and selection; Information Technology and Systems; Pattern Recognition; latent structure; row-sparsity
  • fLanguage
    English
  • Journal_Title
    IEEE Transactions on Knowledge and Data Engineering
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2013.65
  • Filename
    6509368
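
The abstract notes that row-wise sparse models make the learned model suitable for feature selection. The sketch below illustrates only that general idea and is not the CGSSL algorithm, whose formulation this record does not give. It assumes a weight matrix `W` has already been learned, with one row per feature and one column per cluster, and ranks features by the l2 norm of their rows. The function names and the values in `W` are hypothetical.

```python
import math


def row_norms(W):
    """l2 norm of each row of W (one row per feature)."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]


def l21_norm(W):
    """Sum of row l2 norms; penalizing it drives whole rows toward zero."""
    return sum(row_norms(W))


def rank_features_by_row_norm(W):
    """Feature indices sorted by decreasing row norm (ties keep input order)."""
    norms = row_norms(W)
    return sorted(range(len(W)), key=lambda i: norms[i], reverse=True)


# Hypothetical weights for 4 features and 2 clusters (illustrative values only).
W = [
    [0.0, 0.0],  # feature 0: all-zero row, effectively discarded
    [3.0, 4.0],  # feature 1: large row norm
    [0.1, 0.0],  # feature 2: nearly zero
    [1.0, 0.0],  # feature 3: moderate
]

order = rank_features_by_row_norm(W)
top_k = order[:2]  # keep the 2 highest-ranked features
print(order, top_k)
```

A feature whose row is close to zero contributes little to any predicted cluster label, which is why a row-sparse weight matrix can be read directly as a feature ranking.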