DocumentCode :
2629993
Title :
XML document clustering based on common tag names anywhere in the structure
Author :
Alishahi, Mohamad ; Ravakhah, Mehdi ; Shakeriaski, Baharak ; Naghibzade, Mahmud
Author_Institution :
Islamic Azad Univ. Mashhad Branch, Mashhad, Iran
fYear :
2009
fDate :
20-21 Oct. 2009
Firstpage :
588
Lastpage :
595
Abstract :
One of the most effective ways to extract knowledge from large information resources is to apply data mining methods. As the amount of information on the Internet grows rapidly, XML documents have become common because of their many advantages. Knowledge extraction from XML documents is a way to provide more usable results. XCLS is one of the most efficient algorithms for XML document clustering. In this paper we present a new algorithm for clustering XML documents. This algorithm improves on the XCLS algorithm and aims to overcome its problems. We implemented both algorithms and evaluated their clustering quality and running time on the same data sets. In both respects, the new algorithm is shown to perform better.
Keywords :
XML; data mining; document handling; pattern clustering; XCLS algorithm; XML document clustering; clustering quality; data mining methods; information resources; knowledge extraction; Association rules; Clustering algorithms; Data mining; Information resources; Internet; Neural networks; Search engines; Tree data structures; Web sites; XML; XML documents; clustering; data mining; level similarity; level structure;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Conference, 2009. CSICC 2009. 14th International CSI
Conference_Location :
Tehran
Print_ISBN :
978-1-4244-4261-4
Electronic_ISBN :
978-1-4244-4262-1
Type :
conf
DOI :
10.1109/CSICC.2009.5349643
Filename :
5349643