Title :
Speeding Up QA: An Index Structure for Question Queries
Author :
Hu, Maidong ; Elamy, A.-H.
Author_Institution :
Univ. of Alberta, Edmonton
Abstract :
Poor scalability is a major weakness of Web question-answering (QA) systems, resulting in slow responses and long search times. The bottleneck lies in the commercial search engine on which simple QA systems rely. In a typical QA architecture, after the related documents are found in the text corpus, analyzing those documents and extracting the answers takes considerable time. The reason is that the traditional inverted index used by commercial search engines is not optimal for QA systems. One solution is to index the positions of candidate answers directly in the corpus, rather than indexing only single words that carry little meaning on their own. The response time of the improved search engine to QA queries is then expected to be at almost the same level as that of commercial search engines to ordinary user searches. In this paper, we propose a new inverted index structure for QA systems. By indexing possible meaningful phrases relative to the positions of items (words), our approach can improve response time without losing the advantages of the inverted index. Because storage in today's computers is increasingly large and cheap, the extra space the approach uses may be regarded as negligible. Our approach can therefore be used in any large-scale QA system, which always produces an enormous number of possible answers.
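The abstract gives no details of the index layout, so the following is only a minimal sketch of the general idea of storing phrases alongside word positions in an inverted index. The names `build_phrase_index` and `candidate_answers`, the fixed `window` of following words, and the toy documents are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_phrase_index(docs, window=3):
    """Map each word to postings of (doc_id, position, following_phrase).

    Assumption: the "meaningful phrase" is approximated by the next
    `window` tokens after the word; the paper may define it differently.
    """
    index = defaultdict(list)
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        for pos, word in enumerate(tokens):
            phrase = " ".join(tokens[pos + 1 : pos + 1 + window])
            index[word].append((doc_id, pos, phrase))
    return index


def candidate_answers(index, keyword):
    """Return (doc_id, phrase) pairs for a keyword without rescanning documents."""
    postings = index.get(keyword.lower(), [])
    return [(doc_id, phrase) for doc_id, _pos, phrase in postings if phrase]


# Hypothetical two-document corpus for illustration only.
docs = {
    "d1": "Edmonton is the capital of Alberta",
    "d2": "Vancouver is a city in British Columbia",
}
index = build_phrase_index(docs)
print(candidate_answers(index, "capital"))
```

In this sketch the candidate phrase is read straight from the posting list, which illustrates why a phrase-aware index could avoid the document analysis step at query time, at the cost of the extra storage the abstract mentions.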
Keywords :
Internet; indexing; query processing; search engines; QA system; Web question-answering systems; inverted index structure; question query; search engine; search time; Computer architecture; Costs; Data engineering; Delay; Indexing; Large-scale systems; Natural languages; Scalability; Search engines; Text analysis;
Conference_Title :
Electrical and Computer Engineering, 2007. CCECE 2007. Canadian Conference on
Conference_Location :
Vancouver, BC
Print_ISBN :
1-4244-1020-7
Electronic_ISBN :
0840-7789
DOI :
10.1109/CCECE.2007.310