Title :
Mining Frequent Induced Subtree Patterns with Subtree-Constraint
Author :
Lei Zou ; Yansheng Lu ; Huaming Zhang ; Rong Hu
Author_Institution :
HuaZhong Univ. of Sci. & Technol., Wuhan
Abstract :
Mining frequent induced subtree patterns is useful in domains such as XML databases and Web log analysis. However, because of combinatorial explosion, mining all frequent subtree patterns becomes infeasible for a large and dense tree database. Too many frequent subtree patterns also confuse users, and usually only a small subset of the mining results is of interest to them. In this paper, we pose the problem of discovering frequent induced subtree patterns that are supertrees of a user-specified pattern tree, i.e., frequent induced subtree patterns with a subtree-constraint. Most existing frequent subtree mining algorithms are based on right-most extension, which does not work well for this new problem, so we present free extension as a replacement. To avoid the duplicate-pattern problem caused by free extension, we develop an efficient method that ensures no duplicate patterns arise during mining or in the results. We then give the subtree-constraint frequent subtree pattern mining algorithm, SCFS. Experimental results show that our algorithm achieves good performance.
Keywords :
data mining; tree data structures; trees (mathematics); frequent induced subtree patterns; subtree-constraint frequent subtree patterns mining algorithm; tree database; Conferences; Data analysis; Data mining; Databases; Explosions; Indexes; Pattern analysis; Tree graphs; XML
Conference_Titel :
Data Mining Workshops, 2006. ICDM Workshops 2006. Sixth IEEE International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
0-7695-2702-7
DOI :
10.1109/ICDMW.2006.112