Title of article :
Local versus Global Models for Just-In-Time Software Defect Prediction
Author/Authors :
Yang, Xingguang (Department of Computer Science and Engineering, East China University of Science and Technology, China) ; Yu, Huiqun (Department of Computer Science and Engineering, East China University of Science and Technology, China) ; Fan, Guisheng (Department of Computer Science and Engineering, East China University of Science and Technology, China) ; Shi, Kai (Department of Computer Science and Engineering, East China University of Science and Technology, China) ; Chen, Liqiong (Department of Computer Science and Information Engineering, Shanghai Institute of Technology, China)
Pages :
14
From page :
1
To page :
14
Abstract :
Just-in-time software defect prediction (JIT-SDP) is an active topic in software defect prediction that aims to identify defect-inducing changes. Recent studies have found that the variability of defect data sets can affect the performance of defect predictors, and that local models can help improve prediction performance. However, previous studies have focused on module-level defect prediction. Whether local models remain valid in the context of JIT-SDP is an important open question. To this end, we compare the performance of local and global models through a large-scale empirical study based on six open-source projects with 227,417 changes. The experiment considers three evaluation scenarios: cross-validation, cross-project validation, and timewise cross-validation. To build local models, the experiment uses k-medoids to divide the training set into several homogeneous regions. In addition, logistic regression and effort-aware linear regression (EALR) are used to build classification models and effort-aware prediction models, respectively. The empirical results show that local models perform worse than global models in classification performance. However, local models achieve significantly better effort-aware prediction performance than global models in the cross-validation and cross-project validation scenarios. In particular, local models obtain their best effort-aware prediction performance when the number of clusters k is set to 2. Therefore, local models are promising for effort-aware JIT-SDP.
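The local-model idea in the abstract (partition the training set with k-medoids, fit one model per region, and route each new change to the model of its nearest medoid) can be illustrated with a minimal sketch. This is not the authors' implementation. The Manhattan distance, the simple alternating update, the function names, and the `fit` hook are all assumptions made for illustration, and the demo uses a cluster's defect rate as a stand-in where the study uses logistic regression or EALR.

```python
import random

def manhattan(a, b):
    """Manhattan distance (an assumed metric; the abstract does not name one)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def assign(points, medoids, dist):
    """Label each point with the position of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist(p, points[medoids[c]]))
            for p in points]

def k_medoids(points, k, dist=manhattan, max_iter=100, seed=0):
    """Alternating k-medoids: returns (medoid indices, cluster labels)."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)
    for _ in range(max_iter):
        labels = assign(points, medoids, dist)
        new_medoids = []
        for c in range(k):
            members = [i for i, l in enumerate(labels) if l == c]
            if not members:  # keep the old medoid if a cluster is empty
                new_medoids.append(medoids[c])
                continue
            # The new medoid is the member with the smallest total
            # distance to the other members of its cluster.
            new_medoids.append(min(
                members,
                key=lambda i: sum(dist(points[i], points[j]) for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids, dist)

def fit_local_models(X, y, k, fit):
    """Cluster the training set and fit one model per cluster.

    `fit` is a hypothetical hook: it takes (X_cluster, y_cluster) and
    returns a callable that scores a single change.
    """
    medoids, labels = k_medoids(X, k)
    models = []
    for c in range(k):
        idx = [i for i, l in enumerate(labels) if l == c]
        models.append(fit([X[i] for i in idx], [y[i] for i in idx]))
    return medoids, models

def predict_local(x, X, medoids, models):
    """Route x to the model of its nearest medoid."""
    c = min(range(len(medoids)), key=lambda c: manhattan(x, X[medoids[c]]))
    return models[c](x)

def defect_rate(Xc, yc):
    """Stand-in model: the defect rate of the cluster (not the study's model)."""
    rate = sum(yc) / len(yc) if yc else 0.0
    return lambda x: rate

# Toy data: two well-separated groups of changes, labelled 0 (clean) or 1 (defect-inducing).
X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
y = [0, 0, 0, 1, 1, 1]
medoids, models = fit_local_models(X, y, k=2, fit=defect_rate)
```

A global model, by contrast, would call `fit` once on the whole training set and use that single model for every change.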
Keywords :
Global Models , Defect Prediction , Time Software
Journal title :
Scientific Programming
Serial Year :
2019
Full Text URL :
Record number :
2611513
Link To Document :