• DocumentCode
    27778
  • Title

    New Techniques for Mining Frequent Patterns in Unordered Trees

  • Author

    Sen Zhang ; Zhihui Du ; Wang, Jason T. L.

  • Author_Institution
    Dept. of Math., Comput. Sci. & Stat., State Univ. of New York, Oneonta, NY, USA
  • Volume
    45
  • Issue
    6
  • fYear
    2015
  • fDate
    Jun-15
  • Firstpage
    1113
  • Lastpage
    1125
  • Abstract
    We consider a new tree mining problem that aims to discover restrictedly embedded subtree patterns from a set of rooted labeled unordered trees. We study the properties of a canonical form of unordered trees, and develop new Apriori-based techniques to generate all candidate subtrees level by level through two efficient rightmost expansion operations: 1) pairwise joining and 2) leg attachment. Next, we show that restrictedly embedded subtree detection can be achieved by calculating the restricted edit distance between a candidate subtree and a data tree. These techniques are then integrated into an efficient algorithm, named frequent restrictedly embedded subtree miner (FRESTM), to solve the tree mining problem at hand. The correctness of the FRESTM algorithm is proved and the time and space complexities of the algorithm are discussed. Experimental results on synthetic and real-world data demonstrate the effectiveness of the proposed approach.
  • Keywords
    data mining; pattern matching; trees (mathematics); Apriori-based technique; FRESTM algorithm; embedded subtree patterns; frequent patterns mining; leg attachment; pairwise joining; pattern matching; space complexity; time complexity; tree mining problem; unordered trees; Algorithm design and analysis; Computer science; Cybernetics; Data mining; Indexes; Pattern matching; Topology; Apriori algorithm; pattern matching; pattern mining and recognition; rooted labeled unordered trees; rooted labeled unordered trees.;
  • fLanguage
    English
  • Journal_Title
    Cybernetics, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    2168-2267
  • Type

    jour

  • DOI
    10.1109/TCYB.2014.2345579
  • Filename
    6878439