Title :
On XML structural similarity
Author :
Piao Yong ; Liu Chen ; Wang Xiu-Kun
Author_Institution :
EI Sch., Dalian Univ. of Technol., Dalian, China
Abstract :
The XML document model is extended to capture both path and frequency information, yielding the frequency-path model. Based on this model, a structural similarity calculation algorithm, position and frequency weight by longest common subsequence (PFWLCS), is proposed; it is fast and has high precision. The selection of the position and frequency factors is also discussed in depth. Experiments show that PFWLCS achieves a higher recall ratio and accuracy than existing similarity calculation methods, especially on XML documents with different structures.
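The abstract does not give the PFWLCS formulas, so the following Python sketch only illustrates the general idea it describes: counting how often each tag path occurs, then comparing paths with a longest common subsequence in which matches are weighted by position. It is not the authors' algorithm. The function names, the decay parameter `alpha`, and the way frequencies are combined are all assumptions made for illustration.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def path_frequencies(xml_text):
    """Map each root-to-node tag path (a tuple of tags) to its occurrence count."""
    counts = Counter()

    def walk(node, prefix):
        path = prefix + (node.tag,)
        counts[path] += 1
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), ())
    return counts


def weighted_lcs(a, b, alpha):
    """Maximum total weight of a common subsequence of tag paths a and b.

    Assumed position weight: a match at indices (i, j) is worth
    alpha ** max(i, j), so tags closer to the root count more.
    """
    prev = [0.0] * (len(b) + 1)
    for i, x in enumerate(a):
        cur = [0.0]
        for j, y in enumerate(b):
            best = max(prev[j + 1], cur[j])
            if x == y:
                best = max(best, prev[j] + alpha ** max(i, j))
            cur.append(best)
        prev = cur
    return prev[-1]


def path_similarity(a, b, alpha):
    """Weighted LCS normalised to [0, 1] by the best score of the longer path."""
    norm = sum(alpha ** k for k in range(max(len(a), len(b))))
    return weighted_lcs(a, b, alpha) / norm


def directed_similarity(freq_a, freq_b, alpha):
    """Frequency-weighted average of each path's best match in the other document."""
    total = sum(freq_a.values())
    score = sum(
        count * max(path_similarity(p, q, alpha) for q in freq_b)
        for p, count in freq_a.items()
    )
    return score / total


def structural_similarity(xml_a, xml_b, alpha=0.8):
    """Symmetric similarity in [0, 1]; alpha is a hypothetical position factor."""
    fa, fb = path_frequencies(xml_a), path_frequencies(xml_b)
    return (directed_similarity(fa, fb, alpha) + directed_similarity(fb, fa, alpha)) / 2
```

Under these assumptions, two documents with identical path sets and frequencies score 1, documents that share no tags score 0, and a smaller `alpha` shifts more of the weight toward tags near the root.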
Keywords :
XML; PFWLCS; XML document model; XML structural similarity; frequency-path model; position and frequency weight by longest common subsequence; structural similarity calculation algorithm; Algorithm design and analysis; Lead; frequency weight; position weight; structure similarity; the longest common subsequence
Conference_Title :
Industrial and Information Systems (IIS), 2010 2nd International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-1-4244-7860-6
DOI :
10.1109/INDUSIS.2010.5565813