Title of article :
Class noise detection based on software metrics and ROC curves
Author/Authors :
Cagatay Catal, Oral Alan, Kerime Balkan
Issue Information :
Periodical with serial issue number, year 2011
Abstract :
Noise detection for software measurement datasets is a topic of growing interest. The presence of class and attribute noise in software measurement datasets degrades the performance of machine learning-based classifiers, and the identification of these noisy modules improves the overall performance. In this study, we propose a noise detection algorithm based on software metrics threshold values. The threshold values are obtained from the Receiver Operating Characteristic (ROC) analysis. This paper focuses on case studies of five public NASA datasets and details the construction of Naive Bayes-based software fault prediction models both before and after applying the proposed noise detection algorithm. Experimental results show that this noise detection approach is very effective for detecting the class noise and that the performance of fault predictors using a Naive Bayes algorithm with a logNum filter improves if the class labels of identified noisy modules are corrected.
Keywords :
Software quality, Noise detection, Receiver operating characteristic (ROC) curve, Software metrics, Software fault prediction, Metric threshold values
Journal title :
Information Sciences
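Illustrative sketch :
The abstract describes deriving metric threshold values from ROC analysis and using them to detect class noise, without giving the details. The Python sketch below is one possible reading of that idea, not the authors' algorithm. Its assumptions are these: the threshold for each metric is the cut-off that maximizes TPR minus FPR (Youden's J), a module counts as noisy when its label contradicts every metric threshold, and the metric names `loc` and `cc`, the toy data, and both function names are hypothetical.

```python
from typing import Dict, List, Sequence


def roc_threshold(values: Sequence[float], labels: Sequence[int]) -> float:
    """Return the cut-off on one metric that maximizes TPR - FPR.

    Assumption: the record does not say which ROC operating point is
    used, so Youden's J is chosen here purely for illustration.
    A module is predicted faulty (1) when its metric value >= cut-off.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need both faulty and non-faulty modules")

    best_j, best_t = float("-inf"), None
    for t in sorted(set(values)):  # each observed value is a candidate cut-off
        tp = sum(1 for v, y in zip(values, labels) if v >= t and y == 1)
        fp = sum(1 for v, y in zip(values, labels) if v >= t and y == 0)
        j = tp / positives - fp / negatives
        if j > best_j:  # ties keep the lowest cut-off
            best_j, best_t = j, t
    return best_t


def flag_noisy(modules: List[Dict[str, float]],
               labels: Sequence[int],
               thresholds: Dict[str, float]) -> List[int]:
    """Return indices of modules whose label contradicts all thresholds.

    Assumed rule: a non-faulty module with every metric at or above its
    threshold, or a faulty module with every metric below its threshold,
    is treated as class noise.
    """
    noisy = []
    for i, (module, y) in enumerate(zip(modules, labels)):
        above = [module[name] >= t for name, t in thresholds.items()]
        if (y == 0 and all(above)) or (y == 1 and not any(above)):
            noisy.append(i)
    return noisy


# Hypothetical toy data: lines of code and cyclomatic complexity per module.
modules = [
    {"loc": 10, "cc": 1}, {"loc": 12, "cc": 2}, {"loc": 15, "cc": 2},
    {"loc": 200, "cc": 25},   # labeled non-faulty, yet large and complex
    {"loc": 150, "cc": 20}, {"loc": 180, "cc": 22}, {"loc": 170, "cc": 18},
    {"loc": 11, "cc": 1},     # labeled faulty, yet small and simple
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

thresholds = {
    name: roc_threshold([m[name] for m in modules], labels)
    for name in ("loc", "cc")
}
noisy = flag_noisy(modules, labels, thresholds)

# Correct the class labels of the flagged modules before training a classifier.
corrected = [1 - y if i in noisy else y for i, y in enumerate(labels)]
```

On this toy data the thresholds come out as 150 for `loc` and 18 for `cc`, and modules 3 and 7 are flagged. The corrected labels could then be fed to a Naive Bayes learner, as the abstract describes, though the preprocessing used in the paper (the logNum filter) is not reproduced here.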