Title :
Efficiently Mining Unordered Trees
Author :
Chehreghani, Mostafa Haghir
Author_Institution :
Dept. of Comput. Sci., Katholieke Univ. Leuven, Leuven, Belgium
Abstract :
Frequent tree patterns have many applications in different domains such as XML document mining, user web log analysis, network routing and bioinformatics. In this paper, we first introduce three new tree encodings and accordingly present an efficient algorithm for finding frequent patterns in rooted unordered trees, under the assumption that children of every node in database trees are identically labeled. Then, we generalize the method and propose the UITree algorithm to find frequent patterns in rooted unordered trees without any restriction. Compared to other algorithms in the literature, UITree manages occurrences of a candidate tree in database trees more efficiently. Our extensive experiments on both real and synthetic datasets show that UITree significantly outperforms the most efficient existing algorithms for mining unordered trees.
Keywords :
XML; data mining; tree data structures; UITree algorithm; XML document mining; bioinformatics; database trees; efficiently mining unordered trees; frequent tree patterns; network routing; user web log analysis; Clustering algorithms; Databases; Encoding; Routing; candidate generation; frequency counting; rooted unordered trees; tree encoding
Conference_Title :
Data Mining (ICDM), 2011 IEEE 11th International Conference on
Conference_Location :
Vancouver, BC
Print_ISBN :
978-1-4577-2075-8
DOI :
10.1109/ICDM.2011.62