Title :
Using public domain metrics to estimate software development effort
Author :
Jeffery, Ross ; Ruhe, Melanie ; Wieczorek, Isabella
Author_Institution :
Centre for Adv. Empirical Software Res., New South Wales Univ., Sydney, NSW, Australia
Abstract :
The authors investigate the accuracy of cost estimates when applying the most commonly used modeling techniques to a large-scale industrial data set that is professionally maintained by the International Software Benchmarking Standards Group (ISBSG). The modeling techniques applied are ordinary least squares regression (OLS), analogy-based estimation, stepwise ANOVA, CART, and robust regression. The questions addressed in the study relate to two important issues. The first is the appropriate selection of a technique in a given context. The second is the feasibility of using multi-organizational data compared with the benefits of company-specific data collection. The authors compare company-specific models with models based on multi-company data. This is done by comparing estimates derived for one company that contributed to the ISBSG data set from its own data with estimates derived from carefully matched data from the rest of the ISBSG data set. When the rest of the ISBSG data set was used to derive estimates for the company, generally poor results were obtained; robust regression and OLS performed most accurately. When the company's own data was used as the basis for estimation, OLS, a CART-variant, and analogy performed best. In contrast to previous studies, the estimation accuracy when using the company's data is significantly higher than when using the rest of the ISBSG data set. Thus, from these results, the company that contributed to the ISBSG data set would be better off using its own data for cost estimation.
Keywords :
data analysis; least squares approximations; software cost estimation; software metrics; CART; CART-variant; ISBSG data set; International Software Benchmarking Standards Group; analogy based estimation; company-specific data collection; company-specific models; cost estimation; estimation accuracy; large-scale industrial data set; modeling techniques; multi-company data; multi-organizational data; ordinary least squares regression; public domain metrics; robust regression; software cost estimates; software development effort estimation; stepwise ANOVA; Analysis of variance; Computer industry; Computer science; Costs; Data engineering; Large-scale systems; Least squares approximation; Programming; Robustness; Software standards
Conference_Title :
Software Metrics Symposium, 2001. METRICS 2001. Proceedings. Seventh International
Conference_Location :
London
Print_ISBN :
0-7695-1043-4
DOI :
10.1109/METRIC.2001.915512