Regional Information Center for Science and Technology - Variance Analysis in Software Fault Prediction Models

DocumentCode :

2795942

Title :

Variance Analysis in Software Fault Prediction Models

Author :

Jiang, Yue ; Lin, Jie ; Cukic, Bojan ; Menzies, Tim

Author_Institution :

Lane Dept. of Comput. Sci. & Electr. Eng., West Virginia Univ., Morgantown, WV, USA

fYear :

2009

fDate :

16-19 Nov. 2009

Firstpage :

Lastpage :

108

Abstract :

Software fault prediction models play an important role in software quality assurance. They identify software subsystems (modules, components, classes, or files) which are likely to contain faults. These subsystems, in turn, receive additional resources for verification and validation activities. Fault prediction models are binary classifiers typically developed using one of the supervised learning techniques from either a subset of the fault data from the current project or from a similar past project. In practice, it is critical that such models provide reliable prediction performance on data not used in training. Variance is an important reliability indicator of software fault prediction models. However, variance is often ignored or barely mentioned in many published studies. In this paper, through the analysis of twelve data sets from a public software engineering repository from the perspective of variance, we explore the following five questions regarding fault prediction models: (1) Do different types of classification performance measures exhibit different variance? (2) Does the size of the data set imply a more (or less) accurate prediction performance? (3) Does the size of the training subset impact the model's stability? (4) Do different classifiers consistently exhibit different performance in terms of model variance? (5) Are there differences between variance from 1000 runs and 10 runs of 10-fold cross validation experiments? Our results indicate that variance is a very important factor in understanding fault prediction models, and we recommend best practices for reporting variance in empirical software engineering studies.
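As an illustration of question (5), the following is a minimal sketch of how the variance of a performance measure can be estimated over repeated runs of 10-fold cross validation. It is not the authors' experimental setup: the synthetic data set (`make_data`), the one-feature threshold classifier (`fit_threshold`), and the choice of accuracy as the performance measure are all assumptions made only to keep the example self-contained.

```python
import random
import statistics


def make_data(n, seed):
    """Synthetic stand-in for a fault data set: (metric, is_faulty) pairs.
    Hypothetical data, not taken from any real repository."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        faulty = rng.random() < 0.2
        metric = rng.gauss(2.0 if faulty else 0.0, 1.0)
        data.append((metric, faulty))
    return data


def fit_threshold(train):
    """Toy classifier: midpoint between the class means of the metric."""
    faulty = [x for x, y in train if y]
    clean = [x for x, y in train if not y]
    if not faulty or not clean:
        return 0.0  # fallback if one class is absent from the training folds
    return (sum(faulty) / len(faulty) + sum(clean) / len(clean)) / 2.0


def cv_run(data, k, rng):
    """One k-fold cross-validation run; returns accuracy over all held-out items."""
    idx = list(range(len(data)))
    rng.shuffle(idx)  # a fresh random partition for every run
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [data[i] for i in idx if i not in held_out]
        t = fit_threshold(train)
        correct += sum((data[i][0] > t) == data[i][1] for i in fold)
    return correct / len(data)


def repeated_cv(data, runs, k=10, seed=0):
    """Mean and sample variance of accuracy across `runs` repetitions of k-fold CV."""
    rng = random.Random(seed)
    scores = [cv_run(data, k, rng) for _ in range(runs)]
    return statistics.mean(scores), statistics.variance(scores)


data = make_data(100, seed=1)
mean10, var10 = repeated_cv(data, runs=10)
mean1000, var1000 = repeated_cv(data, runs=1000)
print(f"10 runs:   mean accuracy={mean10:.4f}, variance={var10:.6f}")
print(f"1000 runs: mean accuracy={mean1000:.4f}, variance={var1000:.6f}")
```

Each run reshuffles the data before partitioning, so the spread of the per-run scores reflects sensitivity to the fold assignment. Replacing accuracy with another measure, or the toy classifier with a real learner, leaves the structure of `repeated_cv` unchanged.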

Keywords :

learning (artificial intelligence); software fault tolerance; software quality; statistical analysis; binary classifier; reliability indicator; software fault prediction model; software quality assurance; supervised learning; variance analysis; Analysis of variance; Data analysis; Fault diagnosis; Performance analysis; Predictive models; Software engineering; Software measurement; Software quality; Stability; Supervised learning; fault prediction models; machine learning; variance;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Software Reliability Engineering, 2009. ISSRE '09. 20th International Symposium on

Conference_Location :

Mysuru, Karnataka

ISSN :

1071-9458

Print_ISBN :

978-1-4244-5375-7

Electronic_ISBN :

1071-9458

Type :

conf

DOI :

10.1109/ISSRE.2009.13

Filename :

5362090

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2795942