• DocumentCode
    1549019
  • Title
    Discriminative Feature Selection by Nonparametric Bayes Error Minimization
  • Author
    Yang, Shuang-Hong ; Hu, Bao-Gang
  • Author_Institution
    Coll. of Comput., Georgia Inst. of Technol., Atlanta, GA, USA
  • Volume
    24
  • Issue
    8
  • fYear
    2012
  • Firstpage
    1422
  • Lastpage
    1434
  • Abstract
    Feature selection is fundamental to knowledge discovery from massive amounts of high-dimensional data. In an effort to establish theoretical justification for feature selection algorithms, this paper presents a theoretically optimal criterion, namely, the discriminative optimal criterion (DoC) for feature selection. Compared with the existing representative optimal criterion (RoC), which retains maximum information for modeling the relationship between input and output variables, DoC is pragmatically advantageous because it attempts to directly maximize the classification accuracy and naturally reflects the Bayes error in the objective. To make DoC computationally tractable for practical tasks, we propose an algorithmic framework that selects a subset of features by minimizing the Bayes error rate estimated by a nonparametric estimator. A set of existing algorithms, as well as new ones, can be derived naturally from this framework. As an example, we show that the Relief algorithm greedily attempts to minimize the Bayes error estimated by the k-Nearest-Neighbor (kNN) method. This new interpretation reveals the principle behind the family of margin-based feature selection algorithms and also offers a principled way to establish new alternatives for performance improvement. In particular, by exploiting the proposed framework, we establish the Parzen-Relief (P-Relief) algorithm, based on the Parzen window estimator, and the MAP-Relief (M-Relief) algorithm, which integrates label distribution into the max-margin objective to effectively handle imbalanced and multiclass data. Experiments on various benchmark data sets demonstrate the effectiveness of the proposed algorithms.
  • Keywords
    Bayes methods; data mining; estimation theory; Bayes error rate estimation; DoC; P-Relief; Parzen Relief algorithm; Relief algorithm; RoC; discriminative feature selection; discriminative optimal criterion; high-dimensional data; k-nearest neighbor method; kNN; knowledge discovery; maximum information; max-margin objective; nonparametric Bayes error minimization; representative optimal criterion; Algorithm design and analysis; Classification algorithms; Kernel; Minimization; Optimization; Search problems; Training; Feature selection; discriminative optimal criterion; feature weighting
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2011.92
  • Filename
    6226559
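
The abstract describes Relief as a greedy, nearest-neighbor-based feature weighting scheme. As a rough illustration only, the sketch below implements the classic nearest-hit / nearest-miss Relief update in pure Python. It is not the paper's P-Relief or M-Relief, and it does not compute the Bayes error estimate discussed in the abstract. The function name `relief_weights`, the range scaling, the L1 distance, the single sweep over every sample (instead of random sampling), and the toy data are all assumptions made for this example.

```python
def relief_weights(X, y):
    """Classic Relief-style feature weights (illustrative sketch).

    X is a list of equal-length numeric feature lists, y a list of labels.
    Assumption: every sample is visited once and distances are L1.
    """
    n, d = len(X), len(X[0])

    # Scale per-feature differences by the feature's range so that
    # features measured on different scales are comparable.
    spans = []
    for j in range(d):
        column = [row[j] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(a, b):
        return [abs(a[j] - b[j]) / spans[j] for j in range(d)]

    def nearest(i, same_class):
        # Index of the closest other sample with the same label (hit)
        # or with a different label (miss); None if no such sample exists.
        best, best_dist = None, float("inf")
        for k in range(n):
            if k == i or (y[k] == y[i]) != same_class:
                continue
            dist = sum(diff(X[i], X[k]))
            if dist < best_dist:
                best, best_dist = k, dist
        return best

    weights = [0.0] * d
    for i in range(n):
        hit, miss = nearest(i, True), nearest(i, False)
        if hit is None or miss is None:
            continue
        d_hit, d_miss = diff(X[i], X[hit]), diff(X[i], X[miss])
        for j in range(d):
            # Reward features that separate the nearest miss and
            # penalize features that separate the nearest hit.
            weights[j] += (d_miss[j] - d_hit[j]) / n
    return weights


# Toy data (made up): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.8], [1.0, 0.2], [0.8, 0.6]]
y = [0, 0, 0, 1, 1, 1]
weights = relief_weights(X, y)
print(weights)
```

On this toy data the informative feature should receive the larger weight; features can then be ranked by weight and the top-ranked ones retained.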