DocumentCode
1299487
Title
Extended XML Tree Pattern Matching: Theories and Algorithms
Author
Lu, Jiaheng ; Ling, Tok Wang ; Bao, Zhifeng ; Wang, Chen
Author_Institution
Key Lab. of Data Eng. & Knowledge Eng., Renmin Univ. of China, Beijing, China
Volume
23
Issue
3
fYear
2011
fDate
3/1/2011 12:00:00 AM
Firstpage
402
Lastpage
416
Abstract
As business and enterprises generate and exchange XML data more often, there is an increasing need for efficient processing of queries on XML data. Searching for the occurrences of a tree pattern query in an XML database is a core operation in XML query processing. Prior works demonstrate that holistic twig pattern matching algorithm is an efficient technique to answer an XML tree pattern with parent-child (P-C) and ancestor-descendant (A-D) relationships, as it can effectively control the size of intermediate results during query processing. However, XML query languages (e.g., XPath and XQuery) define more axes and functions such as negation function, order-based axis, and wildcards. In this paper, we research a large set of XML tree pattern, called extended XML tree pattern, which may include P-C, A-D relationships, negation functions, wildcards, and order restriction. We establish a theoretical framework about “matching cross” which demonstrates the intrinsic reason in the proof of optimality on holistic algorithms. Based on our theorems, we propose a set of novel algorithms to efficiently process three categories of extended XML tree patterns. A set of experimental results on both real-life and synthetic data sets demonstrate the effectiveness and efficiency of our proposed theories and algorithms.
Keywords
XML; query languages; query processing; XML database; XPath; XQuery; ancestor-descendant relationships; extended XML tree pattern matching; holistic twig pattern matching algorithm; matching cross; negation function; order-based axis; parent-child relationships; query languages; query processing; wildcards; Algorithm design and analysis; Bismuth; Databases; Labeling; Pattern matching; Pediatrics; XML; Query processing; XML/XSL/RDF; algorithms; tree pattern.;
fLanguage
English
Journal_Title
Knowledge and Data Engineering, IEEE Transactions on
Publisher
ieee
ISSN
1041-4347
Type
jour
DOI
10.1109/TKDE.2010.126
Filename
5551129
Link To Document