• DocumentCode
    2369877
  • Title
    Efficient data mining for maximal frequent subtrees
  • Author
    Xiao, Yue; Yao, J.-F.
  • Author_Institution
    Dept. of Math. & Comput. Sci., Georgia Coll. & State Univ., Milledgeville, GA, USA
  • fYear
    2003
  • fDate
    19-22 Nov. 2003
  • Firstpage
    379
  • Lastpage
    386
  • Abstract
    A new type of tree mining is defined, which uncovers maximal frequent induced subtrees from a database of unordered labeled trees. A novel algorithm, PathJoin, is proposed. The algorithm uses a compact data structure, FST-Forest, which compresses the trees and still keeps the original tree structure. PathJoin generates candidate subtrees by joining the frequent paths in FST-Forest. Such candidate subtree generation is localized and thus substantially reduces the number of candidate subtrees. Experiments with synthetic data sets show that the algorithm is effective and efficient.
  • Keywords
    data mining; directed graphs; tree data structures; FST-Forest data structure; PathJoin algorithm; candidate subtree generation; maximal frequent subtrees; synthetic data sets; tree mining; unordered labeled trees database; Association rules; Bioinformatics; Computer science; Data engineering; Data mining; Databases; Educational institutions; Pattern analysis; Tree data structures; Tree graphs
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2003. ICDM 2003. Third IEEE International Conference on
  • Print_ISBN
    0-7695-1978-4
  • Type
    conf
  • DOI
    10.1109/ICDM.2003.1250943
  • Filename
    1250943