• DocumentCode
    1968582
  • Title
    On XML structural similarity
  • Author
    Piao Yong; Liu Chen; Wang Xiu-Kun
  • Author_Institution
    EI Sch., Dalian Univ. of Technol., Dalian, China
  • Volume
    1
  • fYear
    2010
  • fDate
    10-11 July 2010
  • Firstpage
    448
  • Lastpage
    451
  • Abstract
    An XML document model is extended to consider both path and frequency information, yielding the frequency-path model. Based on this model, a structural similarity calculation algorithm, position and frequency weight by longest common subsequence (PFWLCS), is proposed; it is fast and has high precision. Furthermore, the selection of the position and frequency factors is discussed in depth. Experiments show that PFWLCS has a higher recall ratio and accuracy than existing similarity calculation methods, especially on XML documents with different structures.
  • Keywords
    XML; PFWLCS; XML document model; XML structural similarity; frequency-path model; position and frequency weight by longest common subsequence; structural similarity calculation algorithm; Algorithm design and analysis; Lead; frequency weight; position weight; structure similarity; the longest common subsequence
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Industrial and Information Systems (IIS), 2010 2nd International Conference on
  • Conference_Location
    Dalian
  • Print_ISBN
    978-1-4244-7860-6
  • Type
    conf
  • DOI
    10.1109/INDUSIS.2010.5565813
  • Filename
    5565813