• DocumentCode
    25976
  • Title

    A Bootstrap Based Neyman-Pearson Test for Identifying Variable Importance

  • Author

    Ditzler, Gregory ; Polikar, Robi ; Rosen, Gail

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Drexel Univ., Philadelphia, PA, USA
  • Volume
    26
  • Issue
    4
  • fYear
    2015
  • fDate
    April 2015
  • Firstpage
    880
  • Lastpage
    886
  • Abstract
    Selection of the most informative features that lead to a small loss on future data is arguably one of the most important steps in classification, data analysis, and model selection. Several feature selection (FS) algorithms are available; however, due to noise present in any data set, FS algorithms are typically accompanied by an appropriate cross-validation scheme. In this brief, we propose a statistical hypothesis test derived from the Neyman-Pearson lemma for determining whether a feature is statistically relevant. The proposed approach can be applied as a wrapper to any FS algorithm, regardless of the FS criteria used by that algorithm, to determine whether a feature belongs in the relevant set. Perhaps more importantly, this procedure efficiently determines the number of relevant features given an initial starting point. We provide freely available software implementations of the proposed methodology.
  • Keywords
    data analysis; feature selection; pattern classification; statistical distributions; FS; bootstrap based Neyman-Pearson test; data classification; model selection; statistical distribution; Feature extraction; Indexes; Learning systems; Linear programming; Random variables; Testing; Vectors; Feature selection (FS); Neyman-Pearson
  • fLanguage
    English
  • Journal_Title
    Neural Networks and Learning Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    2162-237X
  • Type

    jour

  • DOI
    10.1109/TNNLS.2014.2320415
  • Filename
    6823119