• DocumentCode
    1299487
  • Title

    Extended XML Tree Pattern Matching: Theories and Algorithms

  • Author

    Lu, Jiaheng ; Ling, Tok Wang ; Bao, Zhifeng ; Wang, Chen

  • Author_Institution
    Key Lab. of Data Eng. & Knowledge Eng., Renmin Univ. of China, Beijing, China
  • Volume
    23
  • Issue
    3
  • fYear
    2011
  • fDate
    3/1/2011 12:00:00 AM
  • Firstpage
    402
  • Lastpage
    416
  • Abstract
    As business and enterprises generate and exchange XML data more often, there is an increasing need for efficient processing of queries on XML data. Searching for the occurrences of a tree pattern query in an XML database is a core operation in XML query processing. Prior works demonstrate that holistic twig pattern matching algorithm is an efficient technique to answer an XML tree pattern with parent-child (P-C) and ancestor-descendant (A-D) relationships, as it can effectively control the size of intermediate results during query processing. However, XML query languages (e.g., XPath and XQuery) define more axes and functions such as negation function, order-based axis, and wildcards. In this paper, we research a large set of XML tree pattern, called extended XML tree pattern, which may include P-C, A-D relationships, negation functions, wildcards, and order restriction. We establish a theoretical framework about “matching cross” which demonstrates the intrinsic reason in the proof of optimality on holistic algorithms. Based on our theorems, we propose a set of novel algorithms to efficiently process three categories of extended XML tree patterns. A set of experimental results on both real-life and synthetic data sets demonstrate the effectiveness and efficiency of our proposed theories and algorithms.
  • Keywords
    XML; query languages; query processing; XML database; XPath; XQuery; ancestor-descendant relationships; extended XML tree pattern matching; holistic twig pattern matching algorithm; matching cross; negation function; order-based axis; parent-child relationships; query languages; query processing; wildcards; Algorithm design and analysis; Bismuth; Databases; Labeling; Pattern matching; Pediatrics; XML; Query processing; XML/XSL/RDF; algorithms; tree pattern.;
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2010.126
  • Filename
    5551129