DocumentCode :
1545905
Title :
Software cost estimation with incomplete data
Author :
Strike, Kevin ; El Emam, Khaled ; Madhavji, Nazim
Author_Institution :
Sch. of Comput. Sci., McGill Univ., Montreal, Que., Canada
Volume :
27
Issue :
10
fYear :
2001
fDate :
10/1/2001 12:00:00 AM
Firstpage :
890
Lastpage :
908
Abstract :
The construction of software cost estimation models remains an active topic of research. The basic premise of cost modeling is that a historical database of software project cost data can be used to develop a quantitative model to predict the cost of future projects. One of the difficulties faced by workers in this area is that many of these historical databases contain substantial amounts of missing data. Thus far, the common practice has been to ignore observations with missing data. In principle, such a practice can lead to gross biases and may be detrimental to the accuracy of cost estimation models. We describe an extensive simulation where we evaluate different techniques for dealing with missing data in the context of software cost modeling. Three techniques are evaluated: listwise deletion, mean imputation, and eight different types of hot-deck imputation. Our results indicate that all the missing data techniques perform well with small biases and high precision. This suggests that the simplest technique, listwise deletion, is a reasonable choice. However, this will not necessarily provide the best performance. Consistent best performance (minimal bias and highest precision) can be obtained by using hot-deck imputation with Euclidean distance and a z-score standardization.
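The three families of techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of `None` for a missing value, the toy (size, effort) data, and the choice of a single nearest complete donor matched on the recipient's observed columns are all assumptions made for illustration. The paper evaluates eight hot-deck variants, and this sketch shows only one plausible form of the Euclidean-distance, z-score variant.

```python
import math
from statistics import mean, pstdev

def listwise_deletion(rows):
    """Keep only rows that have no missing (None) values."""
    return [list(r) for r in rows if all(v is not None for v in r)]

def mean_imputation(rows):
    """Replace each missing value with the mean of its column's observed values."""
    cols = list(zip(*rows))
    means = [mean(v for v in c if v is not None) for c in cols]
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def hot_deck_imputation(rows):
    """Fill each incomplete row from its nearest complete row (the donor).

    Assumption: distance is Euclidean over z-scored values, computed only on
    the columns the recipient has observed. Requires at least one complete row.
    """
    cols = list(zip(*rows))
    mu = [mean(v for v in c if v is not None) for c in cols]
    # Fall back to 1.0 when a column has zero spread, to avoid dividing by zero.
    sd = [pstdev([v for v in c if v is not None]) or 1.0 for c in cols]

    def z(value, j):
        return (value - mu[j]) / sd[j]

    donors = listwise_deletion(rows)
    result = []
    for r in rows:
        if all(v is not None for v in r):
            result.append(list(r))
            continue
        seen = [j for j, v in enumerate(r) if v is not None]

        def distance(d):
            return math.sqrt(sum((z(r[j], j) - z(d[j], j)) ** 2 for j in seen))

        donor = min(donors, key=distance)
        result.append([donor[j] if v is None else v for j, v in enumerate(r)])
    return result

# Hypothetical project records: (size, effort); effort is missing for one project.
projects = [[10.0, 100.0], [20.0, 210.0], [35.0, None], [40.0, 390.0]]

complete_only = listwise_deletion(projects)
mean_filled = mean_imputation(projects)
hot_deck_filled = hot_deck_imputation(projects)
```

On this toy data, listwise deletion drops the third project, mean imputation fills its effort with the average of the three observed efforts, and the hot-deck sketch borrows the effort of the project closest in standardized size.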
Keywords :
data integrity; database management systems; software cost estimation; Euclidean distance; cost modeling; data quality; database; hot-deck imputation; incomplete data; listwise deletion; mean imputation; missing data; quantitative model; simulation; software project cost data; z-score standardization; Context modeling; Costs; Databases; Predictive models; Productivity; Project management; Size measurement; Software performance; Standardization
fLanguage :
English
Journal_Title :
Software Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
0098-5589
Type :
jour
DOI :
10.1109/32.962560
Filename :
962560