DocumentCode
2369877
Title
Efficient data mining for maximal frequent subtrees
Author
Xiao, Yue; Yao, J.-F.
Author_Institution
Dept. of Math. & Comput. Sci., Georgia Coll. & State Univ., Milledgeville, GA, USA
fYear
2003
fDate
19-22 Nov. 2003
Firstpage
379
Lastpage
386
Abstract
A new type of tree mining is defined, which uncovers maximal frequent induced subtrees from a database of unordered labeled trees. A novel algorithm, PathJoin, is proposed. The algorithm uses a compact data structure, FST-Forest, which compresses the trees while preserving their original structure. PathJoin generates candidate subtrees by joining the frequent paths in FST-Forest. Because this candidate generation is localized, it substantially reduces the number of candidate subtrees. Experiments with synthetic data sets show that the algorithm is effective and efficient.
Keywords
data mining; directed graphs; tree data structures; FST-Forest data structure; PathJoin algorithm; candidate subtree generation; maximal frequent subtrees; synthetic data sets; tree mining; unordered labeled trees database; Association rules; Bioinformatics; Computer science; Data engineering; Databases; Educational institutions; Pattern analysis; Tree graphs
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining, 2003. ICDM 2003. Third IEEE International Conference on
Print_ISBN
0-7695-1978-4
Type
conf
DOI
10.1109/ICDM.2003.1250943
Filename
1250943