• DocumentCode
    1653807
  • Title
    Selecting Answers to Questions from Web Documents by a Robust Validation Process
  • Author
    Grappy, A. ; Grau, B. ; Falco, M.-H. ; Ligozat, A.-L. ; Robba, I. ; Vilnat, A.
  • Author_Institution
    LIMSI, Univ. Paris-Sud, Orsay, France
  • Volume
    1
  • fYear
    2011
  • Firstpage
    55
  • Lastpage
    62
  • Abstract
    Question answering (QA) systems aim at finding answers to questions posed in natural language using a collection of documents. When the collection is extracted from the Web, the structure and style of the texts are quite different from those of newspaper articles. We developed a QA system based on an answer validation process able to handle Web specificity. A large number of candidate answers are extracted from short passages in order to be validated according to question and passage characteristics. The validation module is based on a machine learning approach. It takes into account criteria characterizing both passage and answer relevance at surface, lexical, syntactic and semantic levels to deal with different types of texts. We present and compare results obtained for factual questions posed on a Web collection and on a newspaper collection. We show that our system outperforms a baseline by up to 48% in MRR.
  • Keywords
    Internet; feature extraction; learning (artificial intelligence); natural language processing; program verification; question answering (information retrieval); text analysis; QA system; Web document; answer selection; document collection; machine learning; natural language; newspaper article; question answering system; robust validation process; semantic level; Data mining; Feature extraction; HTML; Reliability; Semantics; Springs; Syntactics; Web document analysis; answer validation; fine-grained information retrieval; question-answering system;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence and Intelligent Agent Technology (WI-IAT), 2011 IEEE/WIC/ACM International Conference on
  • Conference_Location
    Lyon
  • Print_ISBN
    978-1-4577-1373-6
  • Electronic_ISBN
    978-0-7695-4513-4
  • Type
    conf
  • DOI
    10.1109/WI-IAT.2011.210
  • Filename
    6040496