• DocumentCode
    894791
  • Title

    Count Models for Software Quality Estimation

  • Author

    Khoshgoftaar, Taghi M. ; Gao, Kehan

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Florida Atlantic Univ., Boca Raton, FL
  • Volume
    56
  • Issue
    2
  • fYear
    2007
  • fDate
    6/1/2007 12:00:00 AM
  • Firstpage
    212
  • Lastpage
    222
  • Abstract
    Identifying which software modules are likely to be faulty during the software development process is an effective technique for improving software quality. Such an approach allows a more focused software quality and reliability enhancement effort. The development team may also want to know the number of faults that are likely to exist in a given program module, i.e., a quantitative quality prediction. However, classification techniques such as the logistic regression model (LRM) cannot be used to predict the number of faults. In contrast, count models such as the Poisson regression model (PRM) and the zero-inflated Poisson (ZIP) regression model can be used to obtain both a qualitative classification and a quantitative prediction for software quality. In the case of the classification models, a classification rule based on our previously developed generalized classification rule is used. In the context of count models, this study is the first to propose a generalized classification rule. Case studies of two industrial software systems are examined, and for each we developed two count models (PRM and ZIP) and a classification model (LRM). Evaluating the predictive capabilities of the models, we concluded that the PRM and ZIP models have classification accuracies similar to those of the LRM. The count models are also used to predict the number of faults for the two case studies. The ZIP model yielded better fault prediction accuracy than the PRM. Compared to other quantitative prediction models for software quality, such as multiple linear regression (MLR), the PRM and ZIP models have the unique property of yielding the probability that a given number of faults will occur in any module.
  • Keywords
    principal component analysis; regression analysis; software metrics; software quality; software reliability; stochastic processes; Poisson regression model; fault probability; logistic regression model; multiple linear regression; software development process; software quality classification; software quality estimation count model; software quality prediction model; software reliability enhancement; zero-inflated Poisson regression model; Accuracy; Computer industry; Context modeling; Fault diagnosis; Linear regression; Logistics; Predictive models; Programming; Software systems; Logistic regression; Poisson regression; principal components analysis; software fault prediction; zero-inflated Poisson
  • fLanguage
    English
  • Journal_Title
    Reliability, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9529
  • Type

    jour

  • DOI
    10.1109/TR.2007.896757
  • Filename
    4220787
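
The abstract notes that, unlike multiple linear regression, the Poisson and zero-inflated Poisson models yield the probability that a given number of faults occurs in a module. The sketch below illustrates that property using the standard textbook probability mass functions. It is not the authors' implementation: the function names, the parameters `lam` (Poisson mean) and `pi` (probability of a structural zero), and the threshold rule in `is_fault_prone` are illustrative assumptions, and the threshold rule is not the generalized classification rule proposed in the paper.

```python
import math


def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson count with mean lam (lam > 0, k >= 0)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """P(Y = k) under a zero-inflated Poisson model.

    pi is the probability that a module is a structural zero
    (fault-free regardless of the Poisson part); with probability
    1 - pi the count follows Poisson(lam).
    """
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def expected_faults_zip(lam, pi):
    """Mean of the zero-inflated Poisson: (1 - pi) * lam."""
    return (1.0 - pi) * lam


def is_fault_prone(p_zero, threshold=0.5):
    """Hypothetical rule (an assumption, not the paper's rule):
    flag a module when P(at least one fault) reaches the threshold."""
    return (1.0 - p_zero) >= threshold


if __name__ == "__main__":
    lam, pi = 2.0, 0.3  # made-up parameter values for illustration only
    for k in range(4):
        print(k, poisson_pmf(k, lam), zip_pmf(k, lam, pi))
    print("expected faults (ZIP):", expected_faults_zip(lam, pi))
    print("fault-prone (ZIP):", is_fault_prone(zip_pmf(0, lam, pi)))
```

In a regression setting, `lam` (and, for ZIP, `pi`) would be functions of each module's software metrics rather than constants; the constants here only show how a fitted count model turns into per-module fault probabilities and a classification.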