DocumentCode
3549357
Title
A missing data estimation analysis in type II diabetes databases
Author
Giardina, M. ; Huo, Y. ; Azuaje, F. ; McCullagh, P. ; Harper, R.
Author_Institution
Sch. of Comput. & Math., Ulster Univ., Jordanstown, UK
fYear
2005
fDate
23-24 June 2005
Firstpage
347
Lastpage
352
Abstract
Type II diabetes is one of the most common causes of disability and death in the United Kingdom. This investigation analysed data acquired from diabetic patients at the Ulster Hospital in Northern Ireland in terms of statistical descriptive indicators and missing values. Such data are noisy and incomplete. This paper reports a comprehensive missing data estimation analysis. Five missing value imputation methods were compared, including k-Nearest Neighbours (k-NN) and correlation-based estimation models. From this analysis it can be concluded that a feature-based correlation method known as EMImpute_Columns is a promising approach to estimating missing values. Nevertheless, k-NN methods may be useful to provide relatively accurate estimations with lower error variability. These estimation techniques will support the implementation of supervised and unsupervised learning tools for coronary heart disease risk assessment, a major complication of diabetes.
Keywords
cardiology; data analysis; diseases; medical information systems; unsupervised learning; EMImpute_Columns; Ulster Hospital; coronary heart disease risk assessment; correlation-based estimation model; k-Nearest Neighbours model; lower error variability; missing data estimation analysis; missing value imputation method; statistical descriptive indicators; supervised learning tool; type II diabetes database; unsupervised learning tool; Bioinformatics; Cardiac disease; Correlation; Data analysis; Data mining; Databases; Diabetes; Hospitals; Mathematics; Unsupervised learning
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer-Based Medical Systems, 2005. Proceedings. 18th IEEE Symposium on
Conference_Location
Dublin
ISSN
1063-7125
Print_ISBN
0-7695-2355-2
Type
conf
DOI
10.1109/CBMS.2005.13
Filename
1467714