• DocumentCode
    70854
  • Title
    Temporal Properties of Diagnosis Code Time Series in Aggregate
  • Author
    Perotte, A.; Hripcsak, G.
  • Author_Institution
    Dept. of Biomed. Inf., Columbia Univ., New York, NY, USA
  • Volume
    17
  • Issue
    2
  • fYear
    2013
  • fDate
    March 2013
  • Firstpage
    477
  • Lastpage
    483
  • Abstract
    Time series are essential to health data research and data mining. We aim to study the properties of one of the more commonly available but historically unreliable types of data: administrative diagnoses in the form of the International Classification of Diseases, Ninth Revision (ICD9) codes. We use differential entropy of ICD9 code time series as a surrogate measure for disease time course and also explore Gaussian kernel smoothing to characterize the time course of diseases in a more fine-grained way. Compared to a gold standard created by a panel of clinicians, the first model classified diseases into acute and chronic groups with a receiver operating characteristic area under curve of 0.83. In the second model, several characteristic temporal profiles were observed including permanent, chronic, and acute. In addition, condition dynamics such as the refractory period for giving birth following childbirth were observed. These models demonstrate that ICD9 codes, despite well-documented concerns, contain valid and potentially valuable temporal information.
  • Keywords
    data mining; diseases; medical information systems; patient diagnosis; Gaussian kernel smoothing; ICD9 code time series differential entropy; ICD9 codes; International Classification of Diseases Ninth Revision; ROC area under curve; acute diseases; administrative diagnoses; chronic diseases; diagnosis code time series temporal properties; disease time course surrogate measure; health data research; receiver operating characteristic; temporal information; Diabetes; Diseases; Documentation; Entropy; Sociology; Time series analysis; Biomedical informatics; clinical diagnosis; time series analysis; Data Mining; Electronic Health Records; Female; Humans; International Classification of Diseases; Male; Medical Informatics Applications; ROC Curve; Time Factors
  • fLanguage
    English
  • Journal_Title
    Biomedical and Health Informatics, IEEE Journal of
  • Publisher
    IEEE
  • ISSN
    2168-2194
  • Type
    jour
  • DOI
    10.1109/JBHI.2013.2244610
  • Filename
    6471160