• DocumentCode
    2847579
  • Title

    On the sequencing of tree structures for XML indexing

  • Author

    Wang, Haixun ; Meng, Xiaofeng

  • Author_Institution
    IBM Thomas J. Watson Res. Center, NY, USA
  • fYear
    2005
  • fDate
    5-8 April 2005
  • Firstpage
    372
  • Lastpage
    383
  • Abstract
    Sequence-based XML indexing aims at avoiding expensive join operations in query processing. It transforms structured XML data into sequences so that a structured query can be answered holistically through subsequence matching. In this paper, we address the problem of query equivalence with respect to this transformation, and we introduce a performance-oriented principle for sequencing tree structures. With query equivalence, XML queries can be performed through subsequence matching without join operations, post-processing, or other special handling for problems such as false alarms. We identify a class of sequencing methods for this purpose, and we present a novel subsequence matching algorithm that observe query equivalence. Still, query equivalence is just a prerequisite for sequence-based XML indexing. Our goal is to find the best sequencing strategy with regard to the time and space complexity in indexing and querying XML data. To this end, we introduce a performance-oriented principle to guide the sequencing of tree structures. For any given XML data set, the principle finds an optimal sequencing strategy according to its schema and its data distribution. We present a novel method that realizes this principle. In our experiments, we show the advantages of sequence-based indexing over traditional XML indexing methods, and we compare several sequencing strategies and demonstrate the benefit of the performance-oriented sequencing principle.
  • Keywords
    XML; database indexing; query languages; query processing; tree data structures; XML data querying; XML query processing; join operation; query equivalence; sequence-based XML indexing; space complexity; structured query; subsequence matching algorithm; time complexity; tree structure sequencing; Data mining; Database languages; Indexing; Intrusion detection; Neural networks; Project management; Query processing; Research and development management; Tree data structures; XML;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Data Engineering, 2005. ICDE 2005. Proceedings. 21st International Conference on
  • ISSN
    1084-4627
  • Print_ISBN
    0-7695-2285-8
  • Type

    conf

  • DOI
    10.1109/ICDE.2005.98
  • Filename
    1410144