Title :
Breast cancer staging using Natural Language Processing
Author :
Johanna Johnsi Rani G;Dennis Gladis;Marie Therese Manipadam;Gunadala Ishitha
Author_Institution :
Department of Computer Science, Madras Christian College, Chennai 600 059, South India
Abstract :
Medical diagnostic reports archived in electronic form are a valuable resource for retrospectively understanding the severity of disease among patients and for verifying the correctness of diagnoses. In this work, breast cancer pathology reports are processed using Natural Language Processing (NLP) and Information Extraction (IE) techniques to extract the parameters required for cancer staging, namely Tumour (T), Lymph nodes (N) and Metastases (M). An automated system is developed to process the 'Impression' section of the report, classify T and N using the pTNM classification protocol of the American Joint Committee on Cancer (AJCC), and derive the cancer stage S of each patient. T and N are classified using numerical parameters and non-numeric medical conditions given in the natural language text. Metastasis (M), which is not evident from pathology reports, is given a default value of M0 for staging. A dataset of 150 de-identified reports was reviewed by pathologists to obtain the gold standard for evaluation. The TNM classification and the cancer stage derived by the system were evaluated against the gold standard, and discrepancy reports were generated. The extraction process was then fine-tuned based on the recommendations of the domain experts. The automatic staging process achieved, on average, 73% Precision, 82% Recall, 59% Specificity and 72% Accuracy. The limited performance is due to certain vital information appearing in other sections of the report that are not processed. Processing these sections in future work is expected to improve performance.
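The pipeline described in the abstract (extract numeric parameters from the 'Impression' text, classify T and N, default M to M0, derive the stage) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the regular expressions and the sample sentence are hypothetical, the size and node-count thresholds are the commonly cited AJCC 7th-edition pTNM cut-offs, and the stage table is simplified (no micrometastasis handling, no stage IB, no T4 or N subcategories, no negation detection).

```python
import re

# Hypothetical patterns; real reports vary widely in wording.
SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(cm|mm)\b", re.IGNORECASE)
NODES_RE = re.compile(
    r"(\d+)\s*(?:/|of|out of)\s*(\d+)\s*(?:lymph\s+)?nodes", re.IGNORECASE
)
# Naive keyword flag for a non-numeric condition (assumption: no negation handling).
T4_RE = re.compile(r"chest wall|skin ulceration", re.IGNORECASE)


def classify_t(size_mm, t4_condition=False):
    """Simplified pT category from the greatest tumour dimension in mm."""
    if t4_condition:
        return "T4"
    if size_mm <= 20:
        return "T1"
    if size_mm <= 50:
        return "T2"
    return "T3"


def classify_n(positive_nodes):
    """Simplified pN category from the count of positive axillary nodes."""
    if positive_nodes == 0:
        return "N0"
    if positive_nodes <= 3:
        return "N1"
    if positive_nodes <= 9:
        return "N2"
    return "N3"


# Simplified anatomic stage grouping for M0 (N3 is handled separately).
STAGE_M0 = {
    ("T1", "N0"): "IA",
    ("T1", "N1"): "IIA", ("T2", "N0"): "IIA",
    ("T2", "N1"): "IIB", ("T3", "N0"): "IIB",
    ("T1", "N2"): "IIIA", ("T2", "N2"): "IIIA",
    ("T3", "N1"): "IIIA", ("T3", "N2"): "IIIA",
    ("T4", "N0"): "IIIB", ("T4", "N1"): "IIIB", ("T4", "N2"): "IIIB",
}


def derive_stage(t, n, m="M0"):
    """Derive the stage S from T, N and M."""
    if m != "M0":
        return "IV"
    if n == "N3":
        return "IIIC"
    return STAGE_M0[(t, n)]


def stage_from_impression(text):
    """Extract size and node counts from free text and return (T, N, M, stage)."""
    size = SIZE_RE.search(text)
    nodes = NODES_RE.search(text)
    if size is None or nodes is None:
        raise ValueError("tumour size or lymph node count not found")
    size_mm = float(size.group(1))
    if size.group(2).lower() == "cm":
        size_mm *= 10  # normalise centimetres to millimetres
    t = classify_t(size_mm, bool(T4_RE.search(text)))
    n = classify_n(int(nodes.group(1)))
    m = "M0"  # default, as in the abstract: M is not evident from pathology reports
    return t, n, m, derive_stage(t, n, m)


# Hypothetical 'Impression' sentence, not taken from the study dataset.
sample = ("Infiltrating ductal carcinoma, tumour size 2.5 cm. "
          "Metastasis in 2 of 14 lymph nodes.")
print(stage_from_impression(sample))
```

The `ValueError` branch loosely mirrors the limitation noted in the abstract: when a required parameter does not appear in the processed section, no reliable stage can be derived from that section alone.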
Keywords :
"Pathology","Breast cancer","Medical diagnostic imaging","Tumors","Natural language processing","Standards"
Conference_Title :
Advances in Computing, Communications and Informatics (ICACCI), 2015 International Conference on
Print_ISBN :
978-1-4799-8790-0
DOI :
10.1109/ICACCI.2015.7275834