• DocumentCode
    1758035
  • Title
    Replicating and Re-Evaluating the Theory of Relative Defect-Proneness
  • Author
    Syer, Mark D. ; Nagappan, Meiyappan ; Adams, Bram ; Hassan, Ahmed E.
  • Author_Institution
    Sch. of Comput., Queen's Univ., Kingston, ON, Canada
  • Volume
    41
  • Issue
    2
  • fYear
    2015
  • fDate
    Feb. 1 2015
  • Firstpage
    176
  • Lastpage
    197
  • Abstract
    A good understanding of the factors impacting defects in software systems is essential for software practitioners, because it helps them prioritize quality improvement efforts (e.g., testing and code reviews). Defect prediction models are typically built using classification or regression analysis on product and/or process metrics collected at a single point in time (e.g., a release date). However, current defect prediction models only predict whether a defect will occur, not when, which makes the prioritization of software quality improvement efforts difficult. To address this problem, Koru et al. applied survival analysis techniques to a large number of software systems to study how size (i.e., lines of code) influences the probability that a source code module (e.g., class or file) will experience a defect at any given time. Given that 1) the work of Koru et al. has been instrumental to our understanding of the size-defect relationship, 2) the use of survival analysis in the context of defect modelling has not been well studied, and 3) replication studies are an important component of balanced scholarly debate, we present a replication study of the work by Koru et al. In particular, we present the details necessary to use survival analysis in the context of defect modelling (such details were missing from the original paper by Koru et al.). We also explore how differences between the traditional domains of survival analysis (i.e., medicine and epidemiology) and defect modelling impact our understanding of the size-defect relationship. Practitioners and researchers considering the use of survival analysis should be aware of the implications of our findings.
  • Keywords
    program diagnostics; software quality; software reliability; defect modelling; relative defect-proneness theory; size-defect relationship; software system defects; source code module; survival analysis techniques; Analytical models; Data models; Hazards; Mathematical model; Measurement; Predictive models; Software; Cox models; Survival analysis
  • fLanguage
    English
  • Journal_Title
    Software Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0098-5589
  • Type
    jour
  • DOI
    10.1109/TSE.2014.2361131
  • Filename
    6914599