DocumentCode :
2767650
Title :
An Approach for Measuring Similarity between XML Documents
Author :
Ling, Song ; Shengen, Li ; Qiang, Lv ; Wei, He ; Tongjiang, Yang
Author_Institution :
Sch. of Comput. Sci. & Technol., Shandong Jianzhu Univ., Jinan, China
Volume :
7
fYear :
2009
fDate :
14-16 Aug. 2009
Firstpage :
410
Lastpage :
414
Abstract :
With the widespread diffusion of semi-structured data in XML format, algorithms for mining information from XML documents are becoming increasingly important, and a similarity function is key to a successful XML data management process. In this paper, we propose a new method for measuring the similarity between XML documents that considers both their structure and their content. The method comprises three layers of matching: element matching, path matching, and document matching. Structural similarity is computed with a partial matching technique, and content similarity is computed by taking into account the syntactic information, semantic information, and position of elements.
Keywords :
XML; data mining; data structures; pattern matching; XML data management process; XML documents; document matching; element matching; information mining; layer matching; path matching; semistructured data; similarity measurement; Computer science; Fuzzy systems; Helium; Indexing; Information analysis; Information retrieval; Knowledge management; Query processing; Text analysis; XML matching; similarity
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fuzzy Systems and Knowledge Discovery, 2009. FSKD '09. Sixth International Conference on
Conference_Location :
Tianjin
Print_ISBN :
978-0-7695-3735-1
Type :
conf
DOI :
10.1109/FSKD.2009.412
Filename :
5360042