DocumentCode
2398122
Title
A General Purpose Phenotype Algorithm for Venous Thromboembolism Using Billing Codes and Natural Language Processing
Author
Hinz, Eugenia R McPeek ; Bastarache, Lisa ; Denny, Joshua C.
Author_Institution
Dept. of Biomed. Inf., Vanderbilt Univ., Nashville, TN, USA
fYear
2012
fDate
27-28 Sept. 2012
Firstpage
149
Lastpage
149
Abstract
Deep venous thrombosis and pulmonary embolism are diseases associated with significant morbidity and mortality. Well-described risk factors for venous thromboembolic disease (VTE) include immobility, trauma and genetic hypercoagulability states, yet many cases have no known antecedent risk. Studies that aim to define the missing risk factors should ideally identify all cases of VTE. Defining VTE in the electronic health record is challenging because of the variable duration of VTE treatment, the overlap of its therapeutic modalities with those of other chronic diseases, and preventive treatment related to hospitalization. We designed a general purpose Natural Language Processing (NLP) algorithm to capture acute and historical cases of thromboembolic disease retrospectively in a de-identified electronic health record. Applying the NLP algorithm to a separate evaluation set yielded a positive predictive value (PPV) of 84.7% and a sensitivity of 95.3%, for an F-measure of 0.897, similar to the training-set F-measure of 0.925. Applying the same algorithm to problem lists of patients without VTE ICD-9 codes resulted in a PPV of 83%. NLP of VTE ICD-9 positive cases and non-ICD-9 positive problem lists provides an effective means of capturing both acute and historical cases of venous thromboembolic disease.
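The evaluation-set F-measure quoted in the abstract can be checked from the two reported rates. The sketch below assumes the F-measure is the balanced F1 score, that is, the harmonic mean of PPV (precision) and sensitivity (recall); the abstract does not state the formula, and the function name `f_measure` is illustrative only.

```python
def f_measure(ppv: float, sensitivity: float) -> float:
    """Balanced F-measure (F1): harmonic mean of PPV and sensitivity.

    Assumption: the abstract's "F-measure" is the balanced F1 score.
    """
    if ppv + sensitivity == 0:
        return 0.0  # avoid division by zero when both rates are zero
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Evaluation-set figures reported in the abstract, expressed as fractions.
evaluation_f = f_measure(ppv=0.847, sensitivity=0.953)
print(round(evaluation_f, 3))  # 0.897
```

Under that assumption the computed value rounds to the reported 0.897. The training-set PPV and sensitivity are not given in the abstract, so the 0.925 figure cannot be checked the same way.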
Keywords
cancer; injuries; medical information systems; natural language processing; patient treatment; risk analysis; F-measure; NLP algorithm; VTE treatment; billing codes; chronic diseases; deep venous thrombosis; electronic health record; general purpose phenotype algorithm; heritable hypercoagulability state; hospitalization; immobility; morbidity; mortality; positive predictive value; pulmonary embolism; risk factors; sensitivity; therapeutic modality; trauma; venous thromboembolic disease; Diseases; Educational institutions; Informatics; Natural language processing; Prediction algorithms; Sensitivity; Training;
fLanguage
English
Publisher
IEEE
Conference_Titel
Healthcare Informatics, Imaging and Systems Biology (HISB), 2012 IEEE Second International Conference on
Conference_Location
San Diego, CA
Print_ISBN
978-1-4673-4803-4
Type
conf
DOI
10.1109/HISB.2012.74
Filename
6366176