DocumentCode :
3227428
Title :
Question answering for biology and medicine
Author :
Gobeil, J. ; Patsche, E. ; Theodoro, D. ; Veuthey, A.-L. ; Lovis, C. ; Ruch, P.
Author_Institution :
Inf. Studies Dept., Univ. of Appl. Sci., Geneva, Switzerland
fYear :
2009
fDate :
4-7 Nov. 2009
Firstpage :
1
Lastpage :
5
Abstract :
Biomedical professionals have at their disposal a huge amount of data, both literature (textual content) and databases (structured content). But when they have a question, they often face too many documents to find the appropriate answer in a reasonable time. We have developed a Question Answering system which analyzes the user's question, retrieves the most relevant documents from MEDLINE, and extracts from these retrieved documents a list of candidate answers, ranked by confidence. These candidate answers are concepts drawn from biomedical controlled vocabularies, such as the Medical Subject Headings (MeSH) as a first step, and are extracted from the most relevant documents with pattern matching strategies. For evaluation purposes, we apply the system to two biological databases, UniProt and DrugBank. From these resources, we generated two large benchmarks of 200 questions dealing respectively with diseases and proteins, and with diseases and drugs. For these two sets, the first candidate answer proposed by our system is correct in 57% and 68% of cases respectively, while 70% and 75% respectively of all answers to find are contained in the first ten proposed candidate answers. Despite the use of simple Information Extraction strategies, our system exploits the redundancy of information in the literature in order to provide a powerful Question Answering system.
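The ranking and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the case-insensitive substring match stands in for their unspecified pattern matching strategies, document frequency stands in for their confidence score, and the function names (`rank_candidates`, `top1_accuracy`, `recall_at_k`) are hypothetical.

```python
from collections import Counter


def rank_candidates(documents, vocabulary):
    """Rank vocabulary terms by how many retrieved documents mention them.

    Assumption: a plain case-insensitive substring match replaces the
    paper's pattern matching, and the document count replaces its
    confidence score (this is the "redundancy" idea in its simplest form).
    """
    counts = Counter()
    for text in documents:
        lowered = text.lower()
        for term in vocabulary:
            if term.lower() in lowered:
                counts[term] += 1
    # Most frequently mentioned first; ties broken alphabetically.
    return sorted(counts, key=lambda term: (-counts[term], term))


def top1_accuracy(ranked_lists, gold_sets):
    """Fraction of questions whose first candidate is a correct answer."""
    hits = sum(
        1
        for ranked, gold in zip(ranked_lists, gold_sets)
        if ranked and ranked[0] in gold
    )
    return hits / len(gold_sets)


def recall_at_k(ranked_lists, gold_sets, k=10):
    """Fraction of all expected answers found in the first k candidates."""
    found = sum(
        len(set(ranked[:k]) & gold)
        for ranked, gold in zip(ranked_lists, gold_sets)
    )
    total = sum(len(gold) for gold in gold_sets)
    return found / total
```

Under these assumptions, `top1_accuracy` corresponds to the 57% and 68% figures and `recall_at_k` with `k=10` to the 70% and 75% figures reported above.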
Keywords :
content-based retrieval; database management systems; diseases; drugs; information retrieval; medical information systems; ontologies (artificial intelligence); proteins; DrugBank; MEDLINE; Medical Subject Headings; UniProt; biological databases; biomedical controlled vocabularies; biomedical professionals; diseases; drugs; information extraction strategies; pattern matching; proteins; question answering system; structured contents; textual contents; Bioinformatics; Data mining; Databases; Diseases; Engines; Hospitals; Information retrieval; Medical control systems; Natural languages; Vocabulary; Information Extraction; Information Retrieval; Ontology; Question Answering;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Technology and Applications in Biomedicine, 2009. ITAB 2009. 9th International Conference on
Conference_Location :
Larnaca
Print_ISBN :
978-1-4244-5379-5
Electronic_ISBN :
978-1-4244-5379-5
Type :
conf
DOI :
10.1109/ITAB.2009.5394361
Filename :
5394361