• DocumentCode
    2774362
  • Title

    Feature Selection with High-Dimensional Imbalanced Data

  • Author

    Van Hulse, Jason ; Khoshgoftaar, Taghi M. ; Napolitano, Amri ; Wald, Randall

  • Author_Institution
    Dept. of Comput. & Electr. Eng. & Comput. Sci., Florida Atlantic Univ., Boca Raton, FL, USA
  • fYear
    2009
  • fDate
    6-6 Dec. 2009
  • Firstpage
    507
  • Lastpage
    514
  • Abstract
    Feature selection is an important topic in data mining, especially for high-dimensional datasets. Filtering techniques in particular have received much attention, but detailed comparisons of their performance are lacking. This work considers three filters based on classifier performance metrics and six commonly used filters. All nine filtering techniques are compared and contrasted using five different microarray expression datasets. In addition, given that these datasets exhibit an imbalance between the number of positive and negative examples, the use of sampling techniques in the context of feature selection is examined.
  • Keywords
    data mining; feature extraction; information filtering; pattern classification; classifier performance metrics; data mining; feature selection; filtering technique; high dimensional imbalanced data; microarray expression dataset; Computer science; Conferences; Data analysis; Data mining; Diversity reception; Information filtering; Information filters; Measurement; Sampling methods; USA Councils;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining Workshops, 2009. ICDMW '09. IEEE International Conference on
  • Conference_Location
    Miami, FL
  • Print_ISBN
    978-1-4244-5384-9
  • Electronic_ISBN
    978-0-7695-3902-7
  • Type
    conf
  • DOI
    10.1109/ICDMW.2009.35
  • Filename
    5360460