DocumentCode
2055263
Title
Querying the World Wide Web
Author
Mendelzon, Alberto O. ; Mihaila, George A. ; Milo, Tova
Author_Institution
Dept. of Comput. Sci., Toronto Univ., Ont., Canada
fYear
1996
fDate
18-20 Dec 1996
Firstpage
80
Lastpage
91
Abstract
The World Wide Web is a large, heterogeneous, distributed collection of documents connected by hypertext links. The most common technology currently used for searching the Web depends on sending information retrieval requests to “index servers”. One problem with this is that these queries cannot exploit the structure and topology of the document network. The authors propose a query language, WebSQL, that takes advantage of multiple index servers without requiring users to know about them, and that integrates textual retrieval with structure- and topology-based queries. They give a formal semantics for WebSQL using a calculus based on a novel “virtual graph” model of a document network. They propose a new theory of query cost based on the idea of “query locality,” that is, how much of the network must be visited to answer a particular query. Finally, they describe a prototype implementation of WebSQL written in Java.
Keywords
Internet; SQL; computational linguistics; hypermedia; information retrieval; network servers; process algebra; query processing; Java; Web searching; WebSQL query language; World Wide Web querying; calculus; document network; formal semantics; hypertext links; information retrieval requests; large heterogeneous distributed document collection; multiple index servers; query cost; query locality; textual retrieval; topology-based queries; virtual graph model; Calculus; Computer science; Costs; Database languages; Java; Navigation; Network servers; Network topology; Search engines; Web sites;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Information Systems, 1996., Fourth International Conference on
Conference_Location
Miami Beach, FL
Print_ISBN
0-8186-7475-X
Type
conf
DOI
10.1109/PDIS.1996.568671
Filename
568671