DocumentCode :
866711
Title :
Efficient queries over Web views
Author :
Mecca, Giansalvatore ; Mendelzon, Alberto O. ; Merialdo, Paolo
Author_Institution :
Università della Basilicata, Italy
Volume :
14
Issue :
6
Year :
2002
Firstpage :
1280
Lastpage :
1298
Abstract :
Large Web sites are becoming repositories of structured information that can benefit from being viewed and queried as relational databases. However, querying these views efficiently requires new techniques. Data usually resides at a remote site and is organized as a set of related HTML documents, with network access being a primary cost factor in query evaluation. This cost can be reduced by exploiting the redundancy often found in site design. We use a simple data model, a subset of the Araneus data model, to describe the structure of a Web site. We augment the model with link and inclusion constraints that capture the redundancies in the site. We map relational views of a site to a navigational algebra and show how to use the constraints to rewrite algebraic expressions, reducing the number of network accesses. We show that similar techniques can be used to maintain materialized views over sets of HTML pages.
Keywords :
Web sites; data mining; data models; hypermedia markup languages; query processing; relational algebra; relational databases; Araneus; HTML documents; Internet; Web views; algebraic expressions; data model; materialized views; navigational algebra; query evaluation; query languages; Algebra; Bibliographies; Costs; Database languages; HTML; Information retrieval; Navigation
Language :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2002.1047768
Filename :
1047768