Conference record number :
3297
Article title :
A Proposed Open-Domain Factoid Question Answering System for Noisy and Ambiguous Knowledge-Bases
Authors :
Mazloomzadeh, Iren (Department of Computer Science & Engineering & IT, Shiraz University); Fakhrahmad, Seyed Mostafa (Department of Computer Science & Engineering & IT, Shiraz University); Sadreddini, Mohammad Hadi (Department of Computer Science & Engineering & IT, Shiraz University)
Keywords :
answer extraction, question answering, information extraction, database or knowledge base
Conference title :
19th International Symposium on Artificial Intelligence and Signal Processing
English abstract :
Most advanced web search engines help users retrieve
web pages relevant to their questions in a fraction of a second. But
in many cases, users need an exact and short answer to their
questions. In this situation we need an open-domain question answering
(QA) system. QA is still considered a challenging
problem. This paper proposes a factoid open-domain QA system
for noisy knowledge bases that are automatically extracted from
the web. These knowledge bases contain
extractions that are noisy (e.g., the strings "Obama", "Barack
Obama" and "president Obama" all appear as distinct entities)
and ambiguous (e.g., the relation "born in" contains facts about
both dates and locations). We use the noisy REVERB database, which contains a large cross-section
of world knowledge and is a good testbed for developing an
open-domain QA system, as well as a new database that is
created from the web for each given question, because the web is
known as an attractive knowledge resource for seeking
answers to questions. The proposed system combines the PARALEX
parser with natural language processing techniques and
normalization methods (such as removing adverbs and fixing
misspelled words in questions), yielding a hybrid system that
performs better than PARALEX, a well-known open
question answering system for noisy knowledge bases. The proposed
approach applies a similarity measure to improve the efficiency of
the answer extraction method. Experimental results show
that the proposed scheme significantly outperforms PARALEX in
terms of recall (14% better), precision (7.5% better), F-measure
(13% better), and mean reciprocal rank (MRR; 15% better).
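
The four evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names, the input layout, and the convention used here (precision as correct answers over questions answered, recall as correct answers over all questions) are assumptions made for illustration, and the record does not state which exact definitions the paper uses.

```python
from typing import List, Set, Tuple


def precision_recall_f1(correct: int, answered: int, total: int) -> Tuple[float, float, float]:
    """Assumed QA convention (not confirmed by the paper):
    precision = correct / questions answered,
    recall    = correct / all questions,
    F-measure = harmonic mean of precision and recall."""
    precision = correct / answered if answered else 0.0
    recall = correct / total if total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def mean_reciprocal_rank(ranked: List[List[str]], gold: List[Set[str]]) -> float:
    """Average over questions of 1/rank of the first correct answer.
    A question with no correct answer in its list contributes 0."""
    if not ranked:
        return 0.0
    total = 0.0
    for candidates, accepted in zip(ranked, gold):
        for rank, answer in enumerate(candidates, start=1):
            if answer in accepted:
                total += 1.0 / rank
                break
    return total / len(ranked)


# Hypothetical toy data: ranked candidate answers for three questions.
ranked_answers = [["Honolulu", "Chicago"], ["1961"], ["Kenya", "Hawaii", "USA"]]
gold_answers = [{"Honolulu"}, {"1962"}, {"USA"}]

print(precision_recall_f1(correct=6, answered=8, total=10))
print(mean_reciprocal_rank(ranked_answers, gold_answers))
```

MRR rewards a system for placing a correct answer near the top of its ranked list, which is why it is reported alongside precision and recall for answer extraction.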