• DocumentCode
    52589
  • Title

    Automatic Summarization of Bug Reports

  • Author

    Rastkar, Sarah ; Murphy, Gail C. ; Murray, Gabriel

  • Author_Institution
    Dept. of Comput. Sci., Univ. of British Columbia, Vancouver, BC, Canada
  • Volume
    40
  • Issue
    4
  • fYear
    2014
  • fDate
    April 2014
  • Firstpage
    366
  • Lastpage
    380
  • Abstract
    Software developers access bug reports in a project's bug repository to help with a number of different tasks, including understanding how previous changes have been made and understanding multiple aspects of particular defects. A developer's interaction with existing bug reports often requires perusing a substantial amount of text. In this article, we investigate whether it is possible to summarize bug reports automatically so that developers can perform their tasks by consulting shorter summaries instead of entire bug reports. We investigated whether existing conversation-based automated summarizers are applicable to bug reports and found that the quality of generated summaries is similar to summaries produced for e-mail threads and other conversations. We also trained a summarizer on a bug report corpus. This summarizer produces summaries that are statistically better than summaries produced by existing conversation-based generators. To determine if automatically produced bug report summaries can help a developer with their work, we conducted a task-based evaluation that considered the use of summaries for bug report duplicate detection tasks. We found that summaries helped the study participants save time, that there was no evidence that accuracy degraded when summaries were used and that most participants preferred working with summaries to working with original bug reports.
  • Keywords
    electronic mail; program debugging; software engineering; automatic summarization; bug report corpus; bug report duplicate detection tasks; bug report summaries; bug reports; bug repository; conversation-based automated summarizers; conversation-based generators; e-mail threads; software developers; task-based evaluation; Computer bugs; Detectors; Electronic mail; Feature extraction; Natural languages; Software; Empirical software engineering; bug report duplicate detection; summarization of software artifacts;
  • fLanguage
    English
  • Journal_Title
    Software Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0098-5589
  • Type

    jour

  • DOI
    10.1109/TSE.2013.2297712
  • Filename
    6704866