DocumentCode :
2369877
Title :
Efficient data mining for maximal frequent subtrees
Author :
Xiao, Yue ; Yao, J.-F.
Author_Institution :
Dept. of Math. & Comput. Sci., Georgia Coll. & State Univ., Milledgeville, GA, USA
fYear :
2003
fDate :
19-22 Nov. 2003
Firstpage :
379
Lastpage :
386
Abstract :
A new type of tree mining is defined, which uncovers maximal frequent induced subtrees from a database of unordered labeled trees. A novel algorithm, PathJoin, is proposed. The algorithm uses a compact data structure, FST-Forest, which compresses the trees and still keeps the original tree structure. PathJoin generates candidate subtrees by joining the frequent paths in FST-Forest. Such candidate subtree generation is localized and thus substantially reduces the number of candidate subtrees. Experiments with synthetic data sets show that the algorithm is effective and efficient.
Keywords :
data mining; directed graphs; tree data structures; FST-Forest data structure; PathJoin algorithm; candidate subtree generation; maximal frequent subtrees; synthetic data sets; tree mining; unordered labeled trees database; Association rules; Bioinformatics; Computer science; Data engineering; Data mining; Databases; Educational institutions; Pattern analysis; Tree data structures; Tree graphs
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2003. ICDM 2003. Third IEEE International Conference on
Print_ISBN :
0-7695-1978-4
Type :
conf
DOI :
10.1109/ICDM.2003.1250943
Filename :
1250943