Title :
XCut: Indexing XML Data for Efficient Twig Evaluation
Author :
Sheu, Simon ; Wu, Nigel
Author_Institution :
National Tsing Hua University, Taiwan
Abstract :
Efficient evaluation of XML queries entails fast finding all the occurrences of a twig pattern, a subgraph of treestructured XML documents. One approach models the pattern as a query tree directly, and associates a cursor and a stack to each tree node for coordinated sequential access and partial match tally of all the XML elements. Even if pre-built indices on the elements can facilitate skipping mismatches, this approach works well only when the pattern purely involves ancestor-descendent edges. Another strategy first transforms XML data and query trees one-to-one into sequences. Then, string indices are used for subsequence matching to achieve equivalent skips in transform domain. This strategy performs better when the twig contains ordered parent-child edges and nodes with their own selection predicates. This paper proposes to replace the stacks used in the first approach by simple pipes (double-ended queues) to queue too inner entries of index trees accessed by level-order traversal. Our design permits the cross-verification among queued entries subject to the twig pattern. Early filtration at the inner level of index trees is thus achievable to further expedite processing. Joint with the leaf entries having full specifications of query elements, our design can also suppress any subgraph that violates parent-child edges and selection predicates at query tree nodes. Intensive experimental results also demonstrate such superiority over the second strategy.
Keywords :
Bidirectional control; Content based retrieval; Costs; Filtration; Indexing; Information systems; Pattern matching; Query processing; Tree data structures; XML;
Conference_Titel :
Data Engineering, 2006. ICDE '06. Proceedings of the 22nd International Conference on
Print_ISBN :
0-7695-2570-9
DOI :
10.1109/ICDE.2006.176