DocumentCode
886955
Title
Using machine learning for estimating the defect content after an inspection
Author
Padberg, Frank ; Ragg, Thomas ; Schoknecht, Ralf
Author_Institution
Karlsruhe Univ., Germany
Volume
30
Issue
1
fYear
2004
Firstpage
17
Lastpage
28
Abstract
We view the problem of estimating the defect content of a document after an inspection as a machine learning problem: The goal is to learn from empirical data the relationship between certain observable features of an inspection (such as the total number of different defects detected) and the number of defects actually contained in the document. We show that some features can carry significant nonlinear information about the defect content. Therefore, we use a nonlinear regression technique, neural networks, to solve the learning problem. To select the best among all neural networks trained on a given data set, one usually reserves part of the data set for later cross-validation; in contrast, we use a technique which leaves the full data set for training. This is an advantage when the data set is small. We validate our approach on a known empirical inspection data set. For that benchmark, our novel approach clearly outperforms both linear regression and the current standard methods in software engineering for estimating the defect content, such as capture-recapture. The validation also shows that our machine learning approach can be successful even when the empirical inspection data set is small.
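For orientation, the capture-recapture baseline named in the abstract can be illustrated with the simplest two-inspector estimator (Lincoln-Petersen). This is only a sketch of the baseline idea, not the neural-network method of the paper. The function name, the defect identifiers, and the sample data are hypothetical and chosen purely for illustration.

```python
def lincoln_petersen(found_a: set, found_b: set) -> float:
    """Two-inspector capture-recapture estimate of the total defect count.

    Assumption: both inspectors detect defects independently and every
    defect is equally likely to be found (the classic Lincoln-Petersen model).
    """
    overlap = len(found_a & found_b)  # defects reported by both inspectors
    if overlap == 0:
        raise ValueError("no defect was found by both inspectors; estimate undefined")
    return len(found_a) * len(found_b) / overlap


# Hypothetical inspection result: defect IDs reported by each inspector.
inspector_a = {"D1", "D2", "D3", "D4", "D5", "D6"}
inspector_b = {"D4", "D5", "D6", "D7"}

estimate = lincoln_petersen(inspector_a, inspector_b)  # estimated total defects
detected = len(inspector_a | inspector_b)              # distinct defects actually found
print(estimate, detected)
```

The gap between the estimate and the number of distinct defects detected is the estimated number of defects still in the document. The abstract's approach instead learns the mapping from such observable inspection features to the defect content from empirical data.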
Keywords
learning (artificial intelligence); neural nets; program testing; program verification; regression analysis; defect content estimation; empirical methods; machine learning; neural network; nonlinear regression technique; program validation; software engineering; software inspection; Curve fitting; Estimation error; Inspection; Linear regression; Machine learning; Neural networks; Quality assurance; Software engineering; Software standards; Software testing
fLanguage
English
Journal_Title
Software Engineering, IEEE Transactions on
Publisher
IEEE
ISSN
0098-5589
Type
jour
DOI
10.1109/TSE.2004.1265733
Filename
1265733