  • DocumentCode
    886955
  • Title
    Using machine learning for estimating the defect content after an inspection
  • Author
    Padberg, Frank ; Ragg, Thomas ; Schoknecht, Ralf
  • Author_Institution
    Karlsruhe Univ., Germany
  • Volume
    30
  • Issue
    1
  • fYear
    2004
  • Firstpage
    17
  • Lastpage
    28
  • Abstract
    We view the problem of estimating the defect content of a document after an inspection as a machine learning problem: The goal is to learn from empirical data the relationship between certain observable features of an inspection (such as the total number of different defects detected) and the number of defects actually contained in the document. We show that some features can carry significant nonlinear information about the defect content. Therefore, we use a nonlinear regression technique, neural networks, to solve the learning problem. To select the best among all neural networks trained on a given data set, one usually reserves part of the data set for later cross-validation; in contrast, we use a technique which leaves the full data set for training. This is an advantage when the data set is small. We validate our approach on a known empirical inspection data set. For that benchmark, our novel approach clearly outperforms both linear regression and the current standard methods in software engineering for estimating the defect content, such as capture-recapture. The validation also shows that our machine learning approach can be successful even when the empirical inspection data set is small.
  • Keywords
    learning (artificial intelligence); neural nets; program testing; program verification; regression analysis; defect content estimation; empirical methods; machine learning; neural network; nonlinear regression technique; program validation; software engineering; software inspection; Curve fitting; Estimation error; Inspection; Linear regression; Machine learning; Neural networks; Quality assurance; Software engineering; Software standards; Software testing
  • fLanguage
    English
  • Journal_Title
    Software Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0098-5589
  • Type
    jour
  • DOI
    10.1109/TSE.2004.1265733
  • Filename
    1265733
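
The abstract names capture-recapture as a current standard method for estimating the defect content, and the total number of different defects detected as one observable feature of an inspection. As a minimal illustrative sketch (not the authors' method or data), the Python below computes that feature and the basic two-reviewer Lincoln-Petersen capture-recapture estimate. The function names and the defect IDs are hypothetical and chosen only for illustration.

```python
def total_distinct_defects(reports):
    """Count the different defects found by at least one reviewer."""
    return len(set().union(*reports))


def lincoln_petersen(found_a, found_b):
    """Two-reviewer capture-recapture estimate of the total defect content.

    n1 and n2 are the defects each reviewer found, m is the overlap.
    The estimate is n1 * n2 / m and is undefined when there is no overlap.
    """
    n1, n2 = len(found_a), len(found_b)
    m = len(found_a & found_b)
    if m == 0:
        raise ValueError("no common defects: estimate is undefined")
    return n1 * n2 / m


# Hypothetical inspection: each set holds the defect IDs one reviewer reported.
reviewer_a = {1, 2, 3, 4, 5, 6}
reviewer_b = {4, 5, 6, 7, 8}

detected = total_distinct_defects([reviewer_a, reviewer_b])
estimated = lincoln_petersen(reviewer_a, reviewer_b)
print(detected, estimated)
```

With these made-up reports the reviewers detected 8 different defects and the estimate of the defect content is 10, which would suggest that about 2 defects remain undetected. A learning-based approach such as the one the abstract describes would instead fit a regression model from features like `detected` to known defect counts.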