DocumentCode :
2143418
Title :
A Comparative Study of Threshold-Based Feature Selection Techniques
Author :
Wang, Huanjing ; Khoshgoftaar, Taghi M. ; Van Hulse, Jason
Author_Institution :
Western Kentucky Univ., KY, USA
fYear :
2010
fDate :
14-16 Aug. 2010
Firstpage :
499
Lastpage :
504
Abstract :
Given high-dimensional software measurement data, researchers and practitioners often use feature (metric) selection techniques to improve the performance of software quality classification models. This paper presents our newly proposed threshold-based feature selection techniques and compares their performance by building classification models with five commonly used classifiers. Because a given performance metric usually captures only one aspect of classification performance, the models are evaluated separately using eight different performance metrics. All experiments are conducted on three Eclipse data sets with different levels of class imbalance. The experiments demonstrate that the choice of performance metric may significantly influence the results. In this study, we found four distinct patterns when using the eight performance metrics to order the 11 threshold-based feature selection techniques. Moreover, the performance of the software quality models either improves or remains unchanged despite the removal of over 96% of the software metrics (attributes).
Keywords :
software metrics; software quality; Eclipse data sets; high-dimensional software measurement data; performance metrics; software quality classification models; threshold-based feature selection techniques; Analysis of variance; Data models; Software; Support vector machines; Training data; classification; threshold-based feature selection technique
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Granular Computing (GrC), 2010 IEEE International Conference on
Conference_Location :
San Jose, CA
Print_ISBN :
978-1-4244-7964-1
Type :
conf
DOI :
10.1109/GrC.2010.104
Filename :
5575976