• DocumentCode
    2398122
  • Title

    A General Purpose Phenotype Algorithm for Venous Thromboembolism Using Billing Codes and Natural Language Processing

  • Author

    Hinz, Eugenia R McPeek ; Bastarache, Lisa ; Denny, Joshua C.

  • Author_Institution
    Dept. of Biomed. Inf., Vanderbilt Univ., Nashville, TN, USA
  • fYear
    2012
  • fDate
    27-28 Sept. 2012
  • Firstpage
    149
  • Lastpage
    149
  • Abstract
    Deep venous thrombosis and pulmonary embolism are diseases associated with significant morbidity and mortality. Well-described risk factors for venous thromboembolic disease (VTE) include immobility, trauma, and genetic hypercoagulability states, yet many cases have no known antecedent risk. Studies that aim to define the missing risk factors should ideally identify all cases of VTE. Defining VTE in the electronic health record is challenging because of the variable duration of VTE treatment, the overlap of therapeutic modalities with other chronic diseases, and prophylactic treatment related to hospitalizations. We designed a general purpose natural language processing (NLP) algorithm to capture acute and historical cases of thromboembolic disease retrospectively in a de-identified electronic health record. Applying the NLP algorithm to a separate evaluation set yielded a positive predictive value (PPV) of 84.7% and a sensitivity of 95.3%, for an F-measure of 0.897, similar to the 0.925 obtained on the training set. Applying the same algorithm to problem lists of patients without VTE ICD-9 codes resulted in a PPV of 83%. NLP of VTE ICD-9 positive cases and of problem lists for patients without VTE ICD-9 codes provides an effective means of capturing both acute and historical cases of venous thromboembolic disease.
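    The reported F-measure can be checked against the stated PPV and sensitivity. The sketch below assumes the F-measure is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state the weighting, and the function name and `beta` parameter are illustrative only.

```python
def f_measure(ppv: float, sensitivity: float, beta: float = 1.0) -> float:
    """F-beta score from precision (PPV) and recall (sensitivity).

    beta = 1.0 gives the balanced F1 score, which is an assumption here;
    the abstract does not say which weighting was used.
    """
    if ppv <= 0 or sensitivity <= 0:
        # No true positives retrieved: define the score as zero.
        return 0.0
    b2 = beta * beta
    return (1 + b2) * ppv * sensitivity / (b2 * ppv + sensitivity)


# Evaluation-set figures quoted in the abstract.
evaluation_f = f_measure(ppv=0.847, sensitivity=0.953)
print(round(evaluation_f, 3))  # prints 0.897
```

    Under that assumption, the evaluation-set values reproduce the reported 0.897.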
  • Keywords
    cancer; injuries; medical information systems; natural language processing; patient treatment; risk analysis; F-measure; NLP algorithm; VTE treatment; billing codes; cancer; chronic diseases; deep venous thrombosis; electronic health record; general purpose phenotype algorithm; heritable hypercoagulability state; hospitalization; immobility; morbidity; mortality; natural language processing; positive predictive value; pulmonary embolism; risk factors; sensitivity; therapeutic modality; trauma; venous thromboembolic disease; Diseases; Educational institutions; Informatics; Natural language processing; Prediction algorithms; Sensitivity; Training;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Healthcare Informatics, Imaging and Systems Biology (HISB), 2012 IEEE Second International Conference on
  • Conference_Location
    San Diego, CA
  • Print_ISBN
    978-1-4673-4803-4
  • Type

    conf

  • DOI
    10.1109/HISB.2012.74
  • Filename
    6366176