• DocumentCode
    2897460
  • Title

    Automated duplicate detection for bug tracking systems

  • Author

    Jalbert, Nicholas ; Weimer, Westley

  • Author_Institution
    Univ. of Virginia, Charlottesville, VA
  • fYear
    2008
  • fDate
    24-27 June 2008
  • Firstpage
    52
  • Lastpage
    61
  • Abstract
    Bug tracking systems are important tools that guide the maintenance activities of software developers. The utility of these systems is hampered by an excessive number of duplicate bug reports; in some projects as many as a quarter of all reports are duplicates. Developers must manually identify duplicate bug reports, but this identification process is time-consuming and exacerbates the already high cost of software maintenance. We propose a system that automatically classifies duplicate bug reports as they arrive to save developer time. This system uses surface features, textual semantics, and graph clustering to predict duplicate status. Using a dataset of 29,000 bug reports from the Mozilla project, we perform experiments that include a simulation of a real-time bug reporting environment. Our system is able to reduce development cost by filtering out 8% of duplicate bug reports while allowing at least one report for each real defect to reach developers.
  • Keywords
    graph theory; pattern classification; pattern clustering; program debugging; software maintenance; software tools; tracking; automated duplicate bug report detection; graph clustering; software bug tracking system; software development maintenance activity; surface feature; textual semantics; Computer bugs; Costs; Filtering; Open source software; Operating systems; Software maintenance; Software quality; Software systems; Software tools; Spatial databases;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Dependable Systems and Networks With FTCS and DCC, 2008. DSN 2008. IEEE International Conference on
  • Conference_Location
    Anchorage, AK
  • Print_ISBN
    978-1-4244-2397-2
  • Electronic_ISBN
    978-1-4244-2398-9
  • Type

    conf

  • DOI
    10.1109/DSN.2008.4630070
  • Filename
    4630070