• DocumentCode
    3549357
  • Title

    A missing data estimation analysis in type II diabetes databases

  • Author

    Giardina, M. ; Huo, Y. ; Azuaje, F. ; McCullagh, P. ; Harper, R.

  • Author_Institution
    Sch. of Comput. & Math., Ulster Univ., Jordanstown, UK
  • fYear
    2005
  • fDate
    23-24 June 2005
  • Firstpage
    347
  • Lastpage
    352
  • Abstract
    Type II diabetes is one of the most common causes of disability and death in the United Kingdom. This investigation analysed data acquired from diabetic patients at the Ulster Hospital in Northern Ireland in terms of statistical descriptive indicators and missing values. Such data are noisy and incomplete. This paper reports a comprehensive missing data estimation analysis. Five missing value imputation methods were compared, including k-Nearest Neighbours (k-NN) and correlation-based estimation models. From this analysis it can be concluded that a feature-based correlation method known as EMImpute_Columns is a promising approach to estimating missing values. Nevertheless, k-NN methods may be useful to provide relatively accurate estimations with lower error variability. These estimation techniques will support the implementation of supervised and unsupervised learning tools for coronary heart disease risk assessment, a major complication of diabetes.
  • Keywords
    cardiology; data analysis; diseases; medical information systems; unsupervised learning; EMImpute_Columns; Ulster Hospital; coronary heart disease risk assessment; correlation-based estimation model; k-Nearest Neighbours model; lower error variability; missing data estimation analysis; missing value imputation method; statistical descriptive indicators; supervised learning tool; type II diabetes database; unsupervised learning tool; Bioinformatics; Cardiac disease; Correlation; Data analysis; Data mining; Databases; Diabetes; Hospitals; Mathematics; Unsupervised learning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer-Based Medical Systems, 2005. Proceedings. 18th IEEE Symposium on
  • Conference_Location
    Dublin
  • ISSN
    1063-7125
  • Print_ISBN
    0-7695-2355-2
  • Type

    conf

  • DOI
    10.1109/CBMS.2005.13
  • Filename
    1467714