DocumentCode :
3658034
Title :
An Empirical Study of Dynamic Incomplete-Case Nearest Neighbor Imputation in Software Quality Data
Author :
Jianglin Huang;Hongyi Sun;Yan-Fu Li;Min Xie
Author_Institution :
Dept. of Syst. Eng. &
fYear :
2015
Firstpage :
37
Lastpage :
42
Abstract :
Software quality prediction is an important yet difficult problem in software project development and management. Historical datasets can be used to build models for software quality prediction. However, missing data significantly affects the predictive ability of models in knowledge discovery. Instead of ignoring missing observations, we investigate and improve incomplete-case k-nearest neighbor based imputation. K-nearest neighbor imputation is widely applied but has rarely been refined to select the most appropriate parameter settings for each imputation. This work conducts imputation on four well-known software quality datasets to discover the impact of the new imputation method we propose. We compare it with mean imputation and other commonly used versions of k-nearest neighbor imputation. The empirical results show that the proposed dynamic incomplete-case nearest neighbor imputation performs better than these alternatives when the missingness is completely at random or non-ignorable, regardless of the percentage of missing values.
Keywords :
"Software quality","Nickel","Software engineering","Measurement","Estimation","Predictive models"
Publisher :
IEEE
Conference_Title :
Software Quality, Reliability and Security (QRS), 2015 IEEE International Conference on
Type :
conf
DOI :
10.1109/QRS.2015.16
Filename :
7272912