Title :
A Comparison of Software Fault Imputation Procedures
Author :
Van Hulse, Jason ; Khoshgoftaar, Taghi M. ; Seiffert, Chris
Author_Institution :
Dept. of Comput. Sci. & Eng., Florida Atlantic Univ., Boca Raton, FL
Abstract :
This work presents a detailed comparison of three imputation techniques (Bayesian multiple imputation, regression imputation, and k-nearest-neighbor imputation) at various missingness levels. Starting with a complete real-world software measurement dataset called CCCS, missing values were injected into the dependent variable at four levels according to three different missingness mechanisms. The three imputation techniques are evaluated by comparing the imputed values with the actual values. Our analysis includes a three-way analysis of variance (ANOVA) model, which demonstrates that Bayesian multiple imputation obtains the best performance, followed closely by regression
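The evaluation loop described in the abstract (inject missing values into the dependent variable of a complete dataset, impute them, then compare imputed and actual values) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the synthetic single-predictor data standing in for CCCS, the missing-completely-at-random mechanism, the 20% missingness level, k = 3, and mean absolute error as the comparison measure. Bayesian multiple imputation is omitted, and the function names are hypothetical.

```python
import random


def inject_mcar(y, fraction, rng):
    """Return a copy of y with a random fraction of entries replaced by None."""
    n_missing = int(round(fraction * len(y)))
    missing_idx = set(rng.sample(range(len(y)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(y)]


def knn_impute(x, y_obs, k):
    """Fill each missing y with the mean y of the k complete rows nearest in x."""
    complete = [i for i, v in enumerate(y_obs) if v is not None]
    filled = list(y_obs)
    for i, v in enumerate(y_obs):
        if v is None:
            nearest = sorted(complete, key=lambda j: abs(x[i] - x[j]))[:k]
            filled[i] = sum(y_obs[j] for j in nearest) / len(nearest)
    return filled


def regression_impute(x, y_obs):
    """Fill each missing y from a least-squares line fitted on the complete rows."""
    pairs = [(xi, yi) for xi, yi in zip(x, y_obs) if yi is not None]
    n = len(pairs)
    mean_x = sum(xi for xi, _ in pairs) / n
    mean_y = sum(yi for _, yi in pairs) / n
    sxx = sum((xi - mean_x) ** 2 for xi, _ in pairs)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [intercept + slope * xi if yi is None else yi
            for xi, yi in zip(x, y_obs)]


def mean_abs_error(y_true, y_obs, y_filled):
    """Average |actual - imputed| over the entries that were made missing."""
    errors = [abs(t - f) for t, o, f in zip(y_true, y_obs, y_filled) if o is None]
    return sum(errors) / len(errors)


# Hypothetical complete dataset: one predictor and a noisy linear response.
rng = random.Random(0)
x = [float(i) for i in range(50)]
y = [2.0 * xi + 1.0 + rng.gauss(0.0, 1.0) for xi in x]

# Assumed missingness level of 20%, injected into the dependent variable only.
y_obs = inject_mcar(y, 0.2, rng)

knn_filled = knn_impute(x, y_obs, k=3)
reg_filled = regression_impute(x, y_obs)

print("kNN imputation MAE:       ", mean_abs_error(y, y_obs, knn_filled))
print("regression imputation MAE:", mean_abs_error(y, y_obs, reg_filled))
```

Repeating this loop over several missingness levels and mechanisms would produce the kind of per-condition error measurements that an ANOVA model could then analyze. The sketch stops short of that step.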
Keywords :
Bayes methods; regression analysis; software fault tolerance; software metrics; Bayesian multiple imputation; CCCS; real-world software measurement dataset; regression imputation; software fault imputation; three-way analysis of variance model; Analysis of variance; Bayesian methods; Computer science; Data mining; Linear regression; Military communication; Nearest neighbor searches; Performance analysis; Software engineering; Software measurement
Conference_Title :
Machine Learning and Applications, 2006. ICMLA '06. 5th International Conference on
Conference_Location :
Orlando, FL
Print_ISBN :
0-7695-2735-3
DOI :
10.1109/ICMLA.2006.5