• DocumentCode
    2577993
  • Title
    Automatic Defect Categorization
  • Author
    Thung, Ferdian ; Lo, David ; Jiang, Lingxiao
  • Author_Institution
    Sch. of Inf. Syst., Singapore Manage. Univ., Singapore, Singapore
  • fYear
    2012
  • fDate
    15-18 Oct. 2012
  • Firstpage
    205
  • Lastpage
    214
  • Abstract
    Defects are prevalent in software systems. To understand defects better, industry practitioners often categorize bugs into various types. One common categorization scheme is IBM's Orthogonal Defect Classification (ODC). ODC proposes various orthogonal classifications of defects based on a range of information about them, such as their symptoms and semantics, their root cause analysis, and more. With these category labels, developers can better perform post-mortem analysis to find the common characteristics of the defects that plague a particular software project. Despite the benefits of these categories, the labels are often missing for many software systems. To address this problem, we propose a text mining solution that categorizes defects into various types by analyzing both text from bug reports and code features from bug fixes. To this end, we have manually analyzed data on 500 defects from three software systems and classified them according to ODC. In addition, we propose a classification-based approach that automatically classifies defects into three super-categories composed of ODC categories: control and data flow, structural, and non-functional. Our empirical evaluation shows that the automatic classification approach labels defects with an average accuracy of 77.8% using the SVM multiclass classification algorithm.
  • Keywords
    data mining; pattern classification; program debugging; program diagnostics; support vector machines; IBM orthogonal defect classification; SVM multiclass classification algorithm; automatic classification approach; automatic defect categorization; bug reports; code features; defect semantics; industry practitioners; post-mortem analysis; software projects; software system defects; text mining solution; Reverse engineering
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Reverse Engineering (WCRE), 2012 19th Working Conference on
  • Conference_Location
    Kingston, ON
  • ISSN
    1095-1350
  • Print_ISBN
    978-1-4673-4536-1
  • Type
    conf
  • DOI
    10.1109/WCRE.2012.30
  • Filename
    6385116