DocumentCode
598670
Title
Predicting disease by using data mining based on healthcare information system
Author
Huang, Feixiang ; Wang, Shengyong ; Chan, Chien-Chung
Author_Institution
University of Akron, OH 44325, USA
fYear
2012
fDate
11-13 Aug. 2012
Firstpage
191
Lastpage
194
Abstract
This paper applies the data mining process to predict hypertension from patient medical records that include eight other diseases. A sample of 9,862 cases was studied, extracted from a real-world Healthcare Information System database containing 309,383 medical records. We observed that the distribution of patient diseases in the medical database is imbalanced. An under-sampling technique was applied to generate training data sets, and the data mining tool Weka was used to build the Naïve Bayesian and J-48 classifiers. In addition, an ensemble of five J-48 classifiers was created in an attempt to improve prediction performance, and rough set tools were used to reduce the ensemble based on the idea of second-order approximation. Experimental results showed a slight improvement of the ensemble approach over the individual Naïve Bayesian and J-48 classifiers in accuracy, sensitivity, and F-measure.
Keywords
Accuracy; Area measurement; Bayesian methods; Humans; Immune system; Niobium; Sensitivity;
fLanguage
English
Publisher
IEEE
Conference_Titel
Granular Computing (GrC), 2012 IEEE International Conference on
Conference_Location
Hangzhou, China
Print_ISBN
978-1-4673-2310-9
Type
conf
DOI
10.1109/GrC.2012.6468691
Filename
6468691