DocumentCode
2897460
Title
Automated duplicate detection for bug tracking systems
Author
Jalbert, Nicholas; Weimer, Westley
Author_Institution
Univ. of Virginia, Charlottesville, VA
fYear
2008
fDate
24-27 June 2008
Firstpage
52
Lastpage
61
Abstract
Bug tracking systems are important tools that guide the maintenance activities of software developers. The utility of these systems is hampered by an excessive number of duplicate bug reports; in some projects as many as a quarter of all reports are duplicates. Developers must manually identify duplicate bug reports, but this identification process is time-consuming and exacerbates the already high cost of software maintenance. We propose a system that automatically classifies duplicate bug reports as they arrive to save developer time. This system uses surface features, textual semantics, and graph clustering to predict duplicate status. Using a dataset of 29,000 bug reports from the Mozilla project, we perform experiments that include a simulation of a real-time bug reporting environment. Our system is able to reduce development cost by filtering out 8% of duplicate bug reports while allowing at least one report for each real defect to reach developers.
Keywords
graph theory; pattern classification; pattern clustering; program debugging; software maintenance; software tools; tracking; automated duplicate bug report detection; graph clustering; software bug tracking system; software development maintenance activity; surface feature; textual semantics; Computer bugs; Costs; Filtering; Open source software; Operating systems; Software maintenance; Software quality; Software systems; Software tools; Spatial databases
fLanguage
English
Publisher
IEEE
Conference_Titel
Dependable Systems and Networks With FTCS and DCC, 2008. DSN 2008. IEEE International Conference on
Conference_Location
Anchorage, AK
Print_ISBN
978-1-4244-2397-2
Electronic_ISBN
978-1-4244-2398-9
Type
conf
DOI
10.1109/DSN.2008.4630070
Filename
4630070