• DocumentCode
    598670
  • Title

    Predicting disease by using data mining based on healthcare information system

  • Author

    Huang, Feixiang ; Wang, Shengyong ; Chan, Chien-Chung

  • Author_Institution
    University of Akron, OH 44325, USA
  • fYear
    2012
  • fDate
    11-13 Aug. 2012
  • Firstpage
    191
  • Lastpage
    194
  • Abstract
    This paper applies a data mining process to predict hypertension from patient medical records with eight other diseases. A sample of 9862 cases was studied, extracted from a real-world Healthcare Information System database containing 309383 medical records. We observed that the distribution of patient diseases in the medical database is imbalanced. An under-sampling technique was applied to generate the training data sets, and the data mining tool Weka was used to build Naïve Bayesian and J-48 classifiers. In addition, an ensemble of five J-48 classifiers was created in an attempt to improve prediction performance, and rough set tools were used to reduce the ensemble based on the idea of second-order approximation. Experimental results showed a slight improvement of the ensemble approach over the standalone Naïve Bayesian and J-48 classifiers in accuracy, sensitivity, and F-measure.
  • Keywords
    Accuracy; Area measurement; Bayesian methods; Humans; Immune system; Niobium; Sensitivity
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Granular Computing (GrC), 2012 IEEE International Conference on
  • Conference_Location
    Hangzhou, China
  • Print_ISBN
    978-1-4673-2310-9
  • Type

    conf

  • DOI
    10.1109/GrC.2012.6468691
  • Filename
    6468691