Title :
A Partition-Based Approach for Sequential Patterns Mining
Author :
Nguyen, Son N. ; Orlowska, Maria E.
Author_Institution :
Univ. of Queensland, Brisbane, QLD
Abstract :
Sequential pattern mining has been explored for various data types, and its computational complexity is well understood. Well-known methods such as GSP [1] and PrefixSpan [2] deal effectively with the computational problems involved. However, most methods show limited performance because the number of candidate patterns grows exponentially. Moreover, when the input data set is very large, the problem becomes unsolvable because of main memory limitations. This paper presents a partition-based approach that overcomes this drawback and provides further performance enhancements for sequential pattern computation. Furthermore, the partition-based approach can be extended to the parallel paradigm of mining sequential patterns. We have made a series of observations that have led us to devise data pre-processing methods such that the final step of the partition-based algorithm, where the combination of all local candidate patterns must be processed, is executed on substantially smaller input data. The paper reports results from several experiments that confirm our general and formally presented observations.
Keywords :
computational complexity; data mining; pattern recognition; GSP; PrefixSpan; data pre-processing methods; partition-based approach; sequential pattern mining; Australia; Databases; Fault detection; Fault diagnosis; Itemsets; Partitioning algorithms; Pattern analysis; Pattern matching; Algorithm; Sequential Patterns;
Conference_Title :
Research, Innovation and Vision for the Future, 2007 IEEE International Conference on
Conference_Location :
Hanoi
Print_ISBN :
1-4244-0694-3
DOI :
10.1109/RIVF.2007.369157