DocumentCode :
2600460
Title :
Local vs. global models for effort estimation and defect prediction
Author :
Menzies, Tim ; Butcher, Andrew ; Marcus, Andrian ; Zimmermann, Thomas ; Cok, David
Author_Institution :
CS & EE, WVU, Morgantown, WV, USA
fYear :
2011
fDate :
6-10 Nov. 2011
Firstpage :
343
Lastpage :
351
Abstract :
Data miners can infer rules showing how to improve either (a) the effort estimates of a project or (b) the defect predictions of a software module. Such studies often exhibit conclusion instability regarding what is the most effective action for different projects or modules. This instability can be explained by data heterogeneity. We show that effort and defect data contain many local regions with markedly different properties to the global space. In other words, what appears to be useful in a global context is often irrelevant for particular local contexts. This result raises questions about the generality of conclusions from empirical SE. At the very least, SE researchers should test if their supposedly general conclusions are valid within subsets of their data. At the very most, empirical SE should become a search for local regions with similar properties (and conclusions should be constrained to just those regions).
Keywords :
data mining; program diagnostics; conclusion instability; data heterogeneity; data mining; effort estimation; global models; local models; software module defect prediction; Context; Couplings; Estimation; Principal component analysis; Runtime; Software; USA Councils; Data mining; defect/effort estimation; empirical SE; validation
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Automated Software Engineering (ASE), 2011 26th IEEE/ACM International Conference on
Conference_Location :
Lawrence, KS
ISSN :
1938-4300
Print_ISBN :
978-1-4577-1638-6
Type :
conf
DOI :
10.1109/ASE.2011.6100072
Filename :
6100072