• DocumentCode
    866711
  • Title
    Efficient queries over Web views
  • Author
    Mecca, Giansalvatore; Mendelzon, Alberto O.; Merialdo, Paolo
  • Author_Institution
    Università della Basilicata, Italy
  • Volume
    14
  • Issue
    6
  • fYear
    2002
  • Firstpage
    1280
  • Lastpage
    1298
  • Abstract
    Large Web sites are becoming repositories of structured information that can benefit from being viewed and queried as relational databases. However, querying these views efficiently requires new techniques. Data usually resides at a remote site and is organized as a set of related HTML documents, with network access being a primary cost factor in query evaluation. This cost can be reduced by exploiting the redundancy often found in site design. We use a simple data model, a subset of the Araneus data model, to describe the structure of a Web site. We augment the model with link and inclusion constraints that capture the redundancies in the site. We map relational views of a site to a navigational algebra and show how to use the constraints to rewrite algebraic expressions, reducing the number of network accesses. We show that similar techniques can be used to maintain materialized views over sets of HTML pages.
  • Keywords
    Web sites; data mining; data models; hypermedia markup languages; query processing; relational algebra; relational databases; Araneus; HTML documents; Internet; Web views; algebraic expressions; data model; materialized views; navigational algebra; query evaluation; query languages; Algebra; Bibliographies; Costs; Database languages; HTML; Information retrieval; Navigation
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2002.1047768
  • Filename
    1047768