Abstract:
Bugs are prevalent. To improve software quality, developers often allow users to report the bugs they find through a bug tracking system such as Bugzilla. Users specify, among other things, a description of the bug, the component affected by the bug, and the severity of the bug. Based on this information, bug triagers then assign a priority level to the reported bug. As resources are limited, bug reports are investigated in order of their priority levels. This priority assignment process, however, is a manual one. Could we do better? In this paper, we propose an automated approach based on machine learning that recommends a priority level based on information available in bug reports. Our approach considers multiple factors (temporal, textual, author, related-report, severity, and product) that potentially affect the priority level of a bug report. These factors are extracted as features, which are then used to train a discriminative model via a new classification algorithm that handles ordinal class labels and imbalanced data. Experiments on more than a hundred thousand bug reports from Eclipse show that our approach outperforms baseline approaches in terms of average F-measure, with a relative improvement of 58.61%.
Keywords:
feature extraction; learning (artificial intelligence); pattern classification; program debugging; software quality; Bugzilla; DRONE; Eclipse; author factor; automated approach; average F-measure; bug tracking system; classification algorithm; discriminative model; machine learning; multifactor analysis; priority assignment process; priority level prediction; product factor; related-report factor; reported bugs; severity factor; temporal factor; textual factor; Computer bugs; Engines; Linear regression; Software systems; Standards; Training
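
The evaluation metric named in the abstract, average F-measure, and the notion of relative improvement can be illustrated with a minimal sketch. It assumes that "average F-measure" means the unweighted mean of the per-priority-level F-measures, which weights rare priority levels as heavily as common ones and so suits imbalanced data. The function names, the priority labels `P1` to `P3`, and the toy data are hypothetical and are not taken from the paper.

```python
def average_f_measure(actual, predicted, labels):
    """Unweighted mean of per-label F-measures (assumed definition)."""
    scores = []
    for label in labels:
        pairs = list(zip(actual, predicted))
        tp = sum(1 for a, p in pairs if a == label and p == label)
        fp = sum(1 for a, p in pairs if a != label and p == label)
        fn = sum(1 for a, p in pairs if a == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            # A label that is never predicted correctly contributes 0.
            scores.append(0.0)
    return sum(scores) / len(scores)


def relative_improvement(new_score, baseline_score):
    """Improvement of new_score over baseline_score, as a percentage."""
    return (new_score - baseline_score) / baseline_score * 100.0


# Toy, made-up data: most reports carry the same priority (imbalance).
labels = ["P1", "P2", "P3"]
actual = ["P1", "P2", "P3", "P3", "P3", "P3"]
always_p3 = ["P3"] * len(actual)  # trivial majority-class baseline

baseline_avg_f = average_f_measure(actual, always_p3, labels)
```

On this toy data the majority-class baseline scores well only on `P3` and contributes zero for the other two levels, which is why an averaged per-level metric penalizes classifiers that ignore rare priorities.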