DocumentCode
2774362
Title
Feature Selection with High-Dimensional Imbalanced Data
Author
Van Hulse, Jason ; Khoshgoftaar, Taghi M. ; Napolitano, Amri ; Wald, Randall
Author_Institution
Dept. of Comput. & Electr. Eng. & Comput. Sci., Florida Atlantic Univ., Boca Raton, FL, USA
fYear
2009
fDate
6-6 Dec. 2009
Firstpage
507
Lastpage
514
Abstract
Feature selection is an important topic in data mining, especially for high-dimensional datasets. Filtering techniques in particular have received much attention, but detailed comparisons of their performance are lacking. This work considers three filters that use classifier performance metrics and six commonly used filters. All nine filtering techniques are compared and contrasted using five different microarray expression datasets. In addition, given that these datasets exhibit an imbalance between the number of positive and negative examples, the use of sampling techniques in the context of feature selection is examined.
Keywords
data mining; feature extraction; information filtering; pattern classification; classifier performance metrics; feature selection; filtering technique; high dimensional imbalanced data; microarray expression dataset; Computer science; Conferences; Data analysis; Data mining; Diversity reception; Information filtering; Information filters; Measurement; Sampling methods; USA Councils
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining Workshops, 2009. ICDMW '09. IEEE International Conference on
Conference_Location
Miami, FL
Print_ISBN
978-1-4244-5384-9
Electronic_ISBN
978-0-7695-3902-7
Type
conf
DOI
10.1109/ICDMW.2009.35
Filename
5360460