• DocumentCode
    3658034
  • Title

    An Empirical Study of Dynamic Incomplete-Case Nearest Neighbor Imputation in Software Quality Data

  • Author

    Jianglin Huang;Hongyi Sun;Yan-Fu Li;Min Xie

  • Author_Institution
    Dept. of Syst. Eng. &
  • fYear
    2015
  • Firstpage
    37
  • Lastpage
    42
  • Abstract
    Software quality prediction is an important yet difficult problem in software project development and management. Historical datasets can be used to build models for software quality prediction. However, missing data significantly affect the predictive ability of models in knowledge discovery. Instead of ignoring missing observations, we investigate and improve incomplete-case k-nearest neighbor imputation. K-nearest neighbor imputation is widely applied, but it has rarely been refined so that each imputation uses the most appropriate parameter settings. This work conducts imputation on four well-known software quality datasets to assess the impact of the proposed imputation method. We compare it with mean imputation and other commonly used versions of k-nearest neighbor imputation. The empirical results show that the proposed dynamic incomplete-case nearest neighbor imputation performs better than these alternatives when the missingness is completely at random or non-ignorable, regardless of the percentage of missing values.
  • Keywords
    "Software quality","Nickel","Software engineering","Measurement","Estimation","Predictive models"
  • Publisher
    ieee
  • Conference_Titel
    Software Quality, Reliability and Security (QRS), 2015 IEEE International Conference on
  • Type

    conf

  • DOI
    10.1109/QRS.2015.16
  • Filename
    7272912