• DocumentCode
    1804403
  • Title

    A Path-sequence Based Discrimination for Subtree Matching in Approximate XML Joins

  • Author

    Liang, Wenxin ; Yokota, Haruo

  • Author_Institution
    Tokyo Institute of Technology, Japan
  • fYear
    2006
  • fDate
    2006
  • Abstract
    In this paper, we discuss the one-to-multiple matching problem in leaf-clustering based approximate XML join algorithms and propose a path-sequence based discrimination method to solve it. In our method, each path sequence from the top node to the matched leaf is extracted from the base and target subtrees, and the target subtree most similar to the base subtree is determined by the path-sequence based subtree similarity degree. We evaluate our method experimentally using both real bibliography and bioinformatics XML documents. The experimental results show that our method effectively decreases the occurrence rate of one-to-multiple matching for both bibliography and bioinformatics XML data, and hence improves the precision of leaf-clustering based approximate XML join algorithms.
  • Keywords
    Bibliographies; Bioinformatics; Clustering algorithms; Computer science; Conferences; Data engineering; Internet; XML;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering Workshops, 2006. Proceedings. 22nd International Conference on
  • Conference_Location
    Atlanta, GA, USA
  • Print_ISBN
    0-7695-2571-7
  • Type

    conf

  • DOI
    10.1109/ICDEW.2006.15
  • Filename
    1623911