Title :
Identifying Disease Definitions with a Correlation Kernel for Symptom Extractions from Text
Author :
Minsu Ko ; Sung-Hyon Myaeng
Author_Institution :
Div. of Web Sci. & Technol., Korea Adv. Inst. of Sci. & Technol., Daejeon, South Korea
Abstract :
Since most health-related knowledge is created by experts, it is not easy for the general public to access, understand, and use such knowledge in daily living. It would be most convenient and useful to have a healthcare knowledge base in which a user can start exploring from symptoms, arrive at candidate diseases, and eventually obtain knowledge about treatment and prevention. We have embarked on a project whose goal is to build such a healthcare knowledge base from text by using natural language processing and text mining techniques. This paper focuses on how definition sentences can be detected and describes a method of ranking sentences by the degree to which they contain definitions of diseases, which should include symptom information. While our work is essentially to build a classifier that identifies definition sentences, the main contribution lies in the development of a new kernel method that exploits correlations among different types of tokens. Our evaluation indicates that the proposed method can be very effective with a training set almost an order of magnitude smaller than that needed by a method using a dependency parser.
Keywords :
data mining; diseases; grammars; health care; knowledge based systems; natural language processing; text analysis; candidate disease; correlation kernel; daily living; definition sentence; dependency parser; disease definition; health-related knowledge; healthcare knowledge base; ranking sentence; symptom extraction; symptom information; text mining technique; training data; Correlation; Diseases; Kernel; Support vector machines; Syntactics; Training; colligation; collocation
Conference_Title :
Healthcare Informatics (ICHI), 2014 IEEE International Conference on
DOI :
10.1109/ICHI.2014.50