An exploratory study about the cross-project defect prediction: Impact of using different classification algorithms and a measure of performance in building predictive models

Author

Ricardo F. P. Satin;Igor Scaliante Wiese;Reginaldo Ré

Author_Institution

Departamento Acadêmico de Computação, Universidade Tecnológica Federal do Paraná - Campo Mourão, Paraná, Brasil

fYear

2015

Firstpage

1

Lastpage

12

Abstract

Predicting defects in software projects is a complex task, especially in the initial phases of development, when little data is available. Cross-project defect prediction is indicated in such situations because it enables the reuse of data from similar projects. To find and group similar projects, this paper proposes building cross-project prediction models using a performance measure obtained by applying classification algorithms. To this end, we studied the combined application of different classification, feature selection, and clustering algorithms to 1270 projects in order to build different cross-project prediction models. We concluded that the Naive Bayes algorithm obtained the best performance, with 31.58% of satisfactory predictions across the 19 models built with it. The proposal seems promising, since 31.58% of local predictions were considered satisfactory, against 26.31% of global predictions.

Keywords

"Software","Predictive models","Computational modeling","Prediction algorithms","Clustering algorithms","Software algorithms","Buildings"

Publisher

ieee

Conference_Titel

Computing Conference (CLEI), 2015 Latin American

Type

conf

DOI

10.1109/CLEI.2015.7360033

Filename

7360033