• DocumentCode
    3227428
  • Title
    Question answering for biology and medicine
  • Author
    Gobeill, J. ; Pasche, E. ; Teodoro, D. ; Veuthey, A.-L. ; Lovis, C. ; Ruch, P.
  • Author_Institution
    Inf. Studies Dept., Univ. of Appl. Sci., Geneva, Switzerland
  • fYear
    2009
  • fDate
    4-7 Nov. 2009
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Biomedical professionals have at their disposal a huge amount of data, both textual content such as the literature and structured content such as databases. When they have a question, however, they often face too many documents to find the appropriate answer in a reasonable time. We have developed a Question Answering system which aims to analyze the user's question, to retrieve the most relevant documents from MEDLINE, and to extract from these retrieved documents a list of candidate answers, ranked by confidence. These candidate answers are concepts drawn from biomedical controlled vocabularies, such as the Medical Subject Headings (MeSH) as a first step, and are extracted from the most relevant documents with pattern matching strategies. For evaluation purposes, we apply the system to two biological databases, UniProt and DrugBank. From these resources, we generated two large benchmarks of 200 questions, dealing respectively with diseases and proteins, and with diseases and drugs. For these two sets, the first candidate answer proposed by our system is correct in 57% and 68% of cases respectively, while 70% and 75% respectively of all answers to find are contained in the first ten proposed candidate answers. Although it relies on simple Information Extraction strategies, our system exploits the redundancy of information in the literature to provide a powerful Question Answering system.
  • Keywords
    content-based retrieval; database management systems; diseases; drugs; information retrieval; medical information systems; ontologies (artificial intelligence); proteins; DrugBank; MEDLINE; Medical Subject Headings; UniProt; biological databases; biomedical controlled vocabularies; biomedical professionals; diseases; drugs; information extraction strategies; pattern matching; proteins; question answering system; structured contents; textual contents; Bioinformatics; Data mining; Databases; Diseases; Engines; Hospitals; Information retrieval; Medical control systems; Natural languages; Vocabulary; Information Extraction; Information Retrieval; Ontology; Question Answering
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Technology and Applications in Biomedicine, 2009. ITAB 2009. 9th International Conference on
  • Conference_Location
    Larnaca
  • Print_ISBN
    978-1-4244-5379-5
  • Electronic_ISBN
    978-1-4244-5379-5
  • Type
    conf
  • DOI
    10.1109/ITAB.2009.5394361
  • Filename
    5394361