Regional Information Center for Science and Technology - Clustering GML documents using maximal frequent induced subtrees

DocumentCode :

2028912

Title :

Clustering GML documents using maximal frequent induced subtrees

Author :

Zhu, Ying-wen ; Ji, Gen-lin ; Sun, Qin-hong

Author_Institution :

Dept. of Comput. Found. Teaching, Sanjiang Univ., Nanjing, China

Volume :

fYear :

2010

fDate :

10-12 Aug. 2010

Firstpage :

2265

Lastpage :

2269

Abstract :

This paper presents TBCClustering, an algorithm for clustering GML documents using maximal frequent induced subtree patterns. TBCClustering mines the maximal frequent induced subtrees from the structural information of the GML documents, determines the best minimum support automatically, and then chooses a set of subtree patterns to form the optimistic clustering features. Finally, it uses the CLOPE algorithm to cluster the GML documents by these clustering features, without requiring the number of clusters to be given. Experimental results show that TBCClustering is more effective and efficient than PBClustering.
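
The final step of the abstract, clustering by features without fixing the number of clusters, can be illustrated with a minimal sketch of the CLOPE profit criterion. This is not the authors' implementation: it assumes each GML document has already been reduced to a non-empty set of subtree-pattern identifiers (the names `p1`, `p2`, ... are hypothetical), and it leaves out the subtree mining and the automatic choice of minimum support entirely. The names `Cluster`, `delta_add`, and `clope`, and the repulsion value `r=2.0`, are illustrative assumptions.

```python
from collections import Counter


class Cluster:
    """Running statistics CLOPE keeps per cluster."""

    def __init__(self):
        self.n = 0            # N: number of documents in the cluster
        self.size = 0         # S: total number of feature occurrences
        self.occ = Counter()  # occurrences of each feature

    @property
    def width(self):
        return len(self.occ)  # W: number of distinct features

    def add(self, doc):
        self.n += 1
        self.size += len(doc)
        self.occ.update(doc)

    def remove(self, doc):
        self.n -= 1
        self.size -= len(doc)
        for f in doc:
            self.occ[f] -= 1
            if self.occ[f] == 0:
                del self.occ[f]


def delta_add(cluster, doc, r):
    """Change in S * N / W**r if doc (a non-empty set) joins cluster."""
    s_new = cluster.size + len(doc)
    w_new = cluster.width + sum(1 for f in doc if f not in cluster.occ)
    new = s_new * (cluster.n + 1) / w_new ** r
    old = cluster.size * cluster.n / cluster.width ** r if cluster.n else 0.0
    return new - old


def clope(docs, r=2.0, max_iter=20):
    """Cluster feature sets; the number of clusters is not an input."""
    docs = [frozenset(d) for d in docs]
    clusters, labels = [], [None] * len(docs)

    def best_cluster(doc, current=None):
        # Always offer one empty cluster so a document can start a new one.
        if not any(c.n == 0 for c in clusters):
            clusters.append(Cluster())
        best, best_gain = current, float("-inf")
        if current is not None:
            best_gain = delta_add(clusters[current], doc, r)
        for i, c in enumerate(clusters):
            gain = delta_add(c, doc, r)
            if gain > best_gain:  # strict: ties keep the current cluster
                best, best_gain = i, gain
        return best

    # Phase 1: place each document where the profit gain is largest.
    for i, doc in enumerate(docs):
        labels[i] = best_cluster(doc)
        clusters[labels[i]].add(doc)

    # Phase 2: move documents until a full pass makes no change.
    for _ in range(max_iter):
        moved = False
        for i, doc in enumerate(docs):
            cur = labels[i]
            clusters[cur].remove(doc)
            new = best_cluster(doc, current=cur)
            clusters[new].add(doc)
            if new != cur:
                labels[i], moved = new, True
        if not moved:
            break

    # Renumber labels compactly, skipping clusters that ended up empty.
    remap = {}
    return [remap.setdefault(k, len(remap)) for k in labels]


if __name__ == "__main__":
    docs = [{"p1", "p2", "p3"}, {"p1", "p2"}, {"p7", "p8"}, {"p7", "p8", "p9"}]
    print(clope(docs))  # [0, 0, 1, 1]
```

A larger repulsion `r` penalizes wide clusters more strongly and so tends to produce more, tighter clusters; the value used here is only a placeholder.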

Keywords :

data mining; document handling; pattern clustering; trees (mathematics); CLOPE algorithm; PBClustering; TBCClustering; clustering GML document; maximal frequent induced subtree; optimistic clustering feature; structural information; Algorithm design and analysis; Clustering algorithms; Computers; Data mining; Databases; Encoding; XML; Clustering; GML document mining; Induced subtree; Maximal frequent subtree

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on

Conference_Location :

Yantai, Shandong

Print_ISBN :

978-1-4244-5931-5

Type :

conf

DOI :

10.1109/FSKD.2010.5569321

Filename :

5569321

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2028912