DocumentCode :
2856782
Title :
High-Dimensional Software Engineering Data and Feature Selection
Author :
Wang, Huanjing ; Khoshgoftaar, Taghi M. ; Gao, Kehan ; Seliya, Naeem
Author_Institution :
Western Kentucky Univ., Bowling Green, KY, USA
fYear :
2009
fDate :
2-4 Nov. 2009
Firstpage :
83
Lastpage :
90
Abstract :
Software metrics collected during project development play a critical role in software quality assurance. A software practitioner is very keen on learning which software metrics to focus on for software quality prediction. While a concise set of software metrics is often desired, a typical project collects a very large number of metrics. Minimal attention has been devoted to finding the minimum set of software metrics that have the same predictive capability as a larger set of metrics; we strive to answer that question in this paper. We present a comprehensive comparison between seven commonly used filter-based feature ranking techniques (FRT) and our proposed hybrid feature selection (HFS) technique. Our case study consists of a very high-dimensional (42 software attributes) software measurement data set obtained from a large telecommunications system. The empirical analysis indicates that HFS performs better than FRT; however, the Kolmogorov-Smirnov feature ranking technique demonstrates competitive performance. For the telecommunications system, it is found that only 10% of the software attributes are sufficient for effective software quality prediction.
Keywords :
software metrics; software quality; Kolmogorov-Smirnov feature ranking technique; feature ranking techniques; feature selection; high-dimensional software engineering data; hybrid feature selection; large telecommunications system; project development; software attributes; software practitioner; software quality assurance; Artificial intelligence; Data mining; Filters; Machine learning; Power system modeling; Predictive models; Software engineering; Software measurement; feature ranking; high-dimensional data; quality prediction
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Tools with Artificial Intelligence, 2009. ICTAI '09. 21st International Conference on
Conference_Location :
Newark, NJ
ISSN :
1082-3409
Print_ISBN :
978-1-4244-5619-2
Electronic_ISBN :
1082-3409
Type :
conf
DOI :
10.1109/ICTAI.2009.20
Filename :
5365755