Title : 
Efficient data mining for maximal frequent subtrees
         
        
            Author : 
Xiao, Yue ; Yao, J.-F.
         
        
            Author_Institution : 
Dept. of Math. & Comput. Sci., Georgia Coll. & State Univ., Milledgeville, GA, USA
         
        
            Abstract : 
A new type of tree mining is defined, which uncovers maximal frequent induced subtrees from a database of unordered labeled trees. A novel algorithm, PathJoin, is proposed. The algorithm uses a compact data structure, FST-Forest, which compresses the trees while preserving their original structure. PathJoin generates candidate subtrees by joining the frequent paths in FST-Forest. Because this candidate generation is localized, it substantially reduces the number of candidate subtrees. Experiments with synthetic data sets show that the algorithm is effective and efficient.
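
The abstract says candidates are built by joining frequent paths. As a rough illustration of that building block only, the sketch below counts how many trees in a small database contain each root-to-node label path and keeps those meeting a support threshold. It is not the paper's FST-Forest or PathJoin: the nested-tuple tree representation, the restriction to paths starting at the root, and the names `root_paths` and `frequent_paths` are all assumptions made for this example.

```python
from collections import Counter


def root_paths(tree):
    """Return the set of label paths from the root to every node.

    A tree is a hypothetical (label, children) tuple, where children
    is a list of trees. Child order is ignored, since only label
    sequences along parent-child chains are collected.
    """
    label, children = tree
    paths = {(label,)}
    for child in children:
        for path in root_paths(child):
            paths.add((label,) + path)
    return paths


def frequent_paths(database, min_support):
    """Map each root path to its support, keeping only frequent ones.

    Support is the number of trees containing the path; because
    root_paths returns a set, a path counts at most once per tree.
    """
    counts = Counter()
    for tree in database:
        counts.update(root_paths(tree))
    return {p: c for p, c in counts.items() if c >= min_support}


# Toy database of three unordered labeled trees (invented data).
database = [
    ("a", [("b", [("c", [])]), ("d", [])]),
    ("a", [("d", []), ("b", [])]),
    ("a", [("b", [("c", [])])]),
]

result = frequent_paths(database, min_support=2)
```

With `min_support=2`, every root path in this toy database is frequent; raising the threshold to 3 leaves only the paths `a` and `a-b`. A full miner would go on to combine such paths into candidate subtrees and keep only the maximal frequent ones, which this sketch does not attempt.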
         
        
            Keywords : 
data mining; directed graphs; tree data structures; FST-Forest data structure; PathJoin algorithm; candidate subtree generation; maximal frequent subtrees; synthetic data sets; tree mining; unordered labeled trees database; Association rules; Bioinformatics; Computer science; Data engineering; Data mining; Databases; Educational institutions; Pattern analysis; Tree data structures; Tree graphs
         
        
            Conference_Title : 
Data Mining, 2003. ICDM 2003. Third IEEE International Conference on
         
        
            Print_ISBN : 
0-7695-1978-4
         
        
            DOI : 
10.1109/ICDM.2003.1250943