• DocumentCode
    3562091
  • Title

    Data preprocessing and mortality prediction: The Physionet/CinC 2012 challenge revisited

  • Author

    Johnson, Alistair E. W. ; Kramer, Andrew A. ; Clifford, Gari D.

  • Author_Institution
    Univ. of Oxford, Oxford, UK
  • fYear
    2014
  • Firstpage
    157
  • Lastpage
    160
  • Abstract
    The Physionet/CinC 2012 challenge focused on improving patient-specific mortality predictions in the intensive care unit. While most of the focus in the challenge was on applying sophisticated machine learning algorithms, little attention was paid to the preprocessing performed on the data beforehand. We compare four standard preprocessing methods with a novel Box-Cox outlier rejection technique and analyze their effect on machine learning classifiers for predicting the mortality of ICU patients. The best machine learning model utilized the proposed preprocessing method and achieved an AUROC of 0.848. In general, the AUROC of models using our novel preprocessing method increased, by as much as 0.02 in some cases. Furthermore, the use of preprocessing improved the performance of regression models to a higher level than that of non-linear techniques such as random forests. We demonstrate that proper preprocessing of the data prior to use in a prognostic model can significantly improve performance. This improvement can be even greater than that provided by more complex non-linear machine learning algorithms.
  • Keywords
    cardiology; data mining; learning (artificial intelligence); medical computing; regression analysis; AUROC; Box-Cox outlier rejection technique; ICU patient mortality prediction; Physionet-CinC 2012 challenge; area under the receiver operating characteristic; complex nonlinear machine learning algorithms; computing in cardiology; intensive care unit; machine learning classifiers; prognostic model; random forests; regression model performance; sophisticated machine learning algorithms; standard pre-processing methods; Data models; Data preprocessing; Feature extraction; Heart rate; Predictive models; Support vector machines; Training;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computing in Cardiology Conference (CinC), 2014
  • ISSN
    2325-8861
  • Print_ISBN
    978-1-4799-4346-3
  • Type

    conf

  • Filename
    7043003
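
Note on the abstract: it names a Box-Cox outlier rejection technique but does not describe its steps. The sketch below is only one plausible reading, not the authors' method. It assumes strictly positive measurements (not all identical), picks the Box-Cox parameter by a grid search over the standard profile log-likelihood, and then blanks values that lie far from the median in the transformed space. The function names, the lambda grid, the median/MAD rule, and the threshold `z_max` are all illustrative assumptions.

```python
import math
from statistics import median, pvariance


def box_cox(x, lam):
    """Box-Cox transform of a strictly positive value x."""
    if abs(lam) < 1e-12:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def profile_loglik(values, lam):
    """Box-Cox profile log-likelihood, up to an additive constant."""
    y = [box_cox(v, lam) for v in values]
    log_sum = sum(math.log(v) for v in values)
    return (lam - 1.0) * log_sum - 0.5 * len(values) * math.log(pvariance(y))


def fit_lambda(values, grid=None):
    """Pick lambda from a coarse grid (assumed range: -2.0 to 2.0)."""
    if grid is None:
        grid = [i / 10 for i in range(-20, 21)]
    return max(grid, key=lambda lam: profile_loglik(values, lam))


def reject_outliers(values, lam=None, z_max=3.0):
    """Replace outliers with None, judged in Box-Cox space.

    Assumption: a value is an outlier when its robust z-score
    (distance from the median divided by the scaled MAD) exceeds z_max.
    """
    if lam is None:
        lam = fit_lambda(values)
    y = [box_cox(v, lam) for v in values]
    center = median(y)
    mad = 1.4826 * median(abs(t - center) for t in y)
    if mad == 0:
        return list(values)  # no spread to judge against
    return [v if abs(t - center) <= z_max * mad else None
            for v, t in zip(values, y)]


# Hypothetical heart-rate readings with one implausible entry.
heart_rate = [60, 65, 70, 72, 75, 78, 80, 82, 85, 90, 100000]
cleaned = reject_outliers(heart_rate, lam=0.0)  # lam=0 is the log transform
```

Rejected entries are returned as `None` so that a later imputation step can treat them as missing values; whether the original work removes, clips, or imputes such values is not stated in the abstract.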