DocumentCode
390718
Title
Querying Web data - the WebQA approach
Author
Lam, Sunny K.S. ; Ozsu, M.T.
Author_Institution
Sch. of Comput. Sci., Waterloo Univ., Ont., Canada
fYear
2002
fDate
12-14 Dec. 2002
Firstpage
139
Lastpage
148
Abstract
The common paradigm of searching and retrieving information on the Web is based on keyword-based search using one or more search engines, then browsing through the large number of returned URLs. This is significantly weaker than declarative querying that is supported by DBMSs. The lack of a schema and high volatility of the Web make "database-like" querying of Web data difficult. We report on our work in building a system, called WebQA, that provides a declarative query-based approach to Web data retrieval that uses question-answering technology in extracting information from Web sites that are retrieved by search engines. The approach consists of first using meta-search techniques in an open environment to gather candidate responses from search engines and other on-line databases, then using information extraction techniques to find the answer to a specific question from these candidates. A prototype system has been developed to test this approach. Testing includes evaluation of its performance as a question-answering system using a well-known evaluation system called TREC-9. Its accuracy using TREC-9 data for simple questions is high and its retrieval performance is good. The system employs an open system architecture allowing for on-going improvements.
Keywords
Web sites; information retrieval; search engines; TREC-9; URLs; Web data querying; Web data retrieval; WebQA Approach; browsing; candidate responses; declarative querying; information extraction techniques; information searching; keyword-based search; meta-search techniques; on-line databases; open environment; question-answering technology; Computer science; Data mining; Databases; Metasearch; Open systems; Prototypes; System testing; Uniform resource locators
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Information Systems Engineering, 2002. WISE 2002. Proceedings of the Third International Conference on
Print_ISBN
0-7695-1766-8
Type
conf
DOI
10.1109/WISE.2002.1181651
Filename
1181651