DocumentCode :
2652519
Title :
Measuring Stability of Threshold-Based Feature Selection Techniques
Author :
Wang, Huanjing ; Khoshgoftaar, Taghi M.
Author_Institution :
Western Kentucky Univ., Bowling Green, KY, USA
fYear :
2011
fDate :
7-9 Nov. 2011
Firstpage :
986
Lastpage :
993
Abstract :
Feature selection has been applied in many domains, such as text mining and software engineering. Ideally, a feature selection technique should produce consistent outputs regardless of minor variations in the input data. Researchers have recently begun to examine the stability (robustness) of feature selection techniques. The stability of a feature selection method is defined as the degree of agreement between its outputs on randomly selected subsets of the same input data. This study evaluated the stability of 11 threshold-based feature ranking techniques (rankers) when applied to 16 real-world software measurement datasets of different sizes. Experimental results demonstrate that AUC (Area Under the Receiver Operating Characteristic Curve) and PRC (Area Under the Precision-Recall Curve) performed best among the 11 rankers.
Keywords :
data handling; software metrics; area under the precision-recall curve; area under the receiver operating characteristic curve; randomly selected subsets; software engineering; software measurement datasets; stability measurement; text mining; threshold based feature selection techniques; Computational modeling; Indexes; Measurement; Robustness; Software; Stability criteria; stability; threshold-based feature ranking
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Tools with Artificial Intelligence (ICTAI), 2011 23rd IEEE International Conference on
Conference_Location :
Boca Raton, FL
ISSN :
1082-3409
Print_ISBN :
978-1-4577-2068-0
Electronic_ISBN :
1082-3409
Type :
conf
DOI :
10.1109/ICTAI.2011.169
Filename :
6103460