DocumentCode :
237283
Title :
An Improved Discriminative Model for Duplication Detection on Bug Reports with Cluster Weighting
Author :
Meng-Jie Lin ; Cheng-Zen Yang
Author_Institution :
Dept. of Comput. Sci. & Eng., Yuan Ze Univ., Chungli, Taiwan
fYear :
2014
fDate :
21-25 July 2014
Firstpage :
117
Lastpage :
122
Abstract :
Processing bug reports plays an important role in software maintenance. Recently, the detection of duplicate bug reports has drawn attention because such reports appear in considerable numbers. Many NLP-based detection schemes have been proposed. However, cluster-level correlation relationships have not been extensively considered in past studies. In this paper, we present an improved detection scheme that uses cluster weighting to enhance the detection performance of a previous SVM-based method. We have conducted empirical studies with three open source software projects: Apache, ArgoUML, and SVN. Compared with the original SVM-based method, the proposed SVM-TC scheme improves the top-5 recall rates by 2.83-16.32% across the three projects.
Keywords :
natural language processing; pattern clustering; program debugging; public domain software; software maintenance; support vector machines; Apache; ArgoUML; NLP-based detection scheme; SVM-TC scheme; SVM-based method; SVN; cluster weighting; cluster-level correlation relationship; detection performance; discriminative model; duplicate bug reports; duplication detection; open source software project; software maintenance; Correlation; Feature extraction; Mathematical model; Software; Support vector machines; Training; Vectors; Bug Reports; Cluster Weighting; Duplication Detection; Empirical Study;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Software and Applications Conference (COMPSAC), 2014 IEEE 38th Annual
Conference_Location :
Västerås
Type :
conf
DOI :
10.1109/COMPSAC.2014.18
Filename :
6899208