Title :
Fast discovery of frequent closed sequential patterns based on positional data
Author :
Huang, Guo-yan ; Yang, Fei ; Hu, Chang-zhen ; Ren, Jia-dong
Author_Institution :
Coll. of Inf. Sci. & Eng., Yanshan Univ., Qinhuangdao, China
Abstract :
Mining frequent closed sequential patterns is one of the hot topics in data mining. This paper proposes a novel frequent closed sequential pattern mining algorithm, FCSM-PD (frequent closed sequential pattern mining algorithm based on positional data), which improves the BIDE algorithm by using positional data. The positional data records the position information of items. Because all position information of the prefix sequences is stored in advance, checking whether a prefix sequence has an extension at a given position can be done by scanning the stored position information of that prefix sequence, rather than by repeatedly scanning the pseudo-projected database as in the BI-Directional Extension closure checking scheme, which is the most time-consuming phase of BIDE. In addition, an optimization strategy is applied to reduce the time and memory cost of the mining process. Experimental results show that FCSM-PD runs significantly faster than BIDE, especially on dense databases.
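The idea of replacing repeated database scans with lookups in stored positions can be illustrated with a minimal Python sketch. It is not the FCSM-PD implementation from the paper: it assumes sequences of single items (no itemsets), and the names build_position_index, first_position_after, and support_of_extension are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def build_position_index(database):
    """Map each item to {sequence id: ascending list of positions}.

    Assumption: each sequence is a list of single items.
    """
    index = defaultdict(dict)
    for sid, seq in enumerate(database):
        for pos, item in enumerate(seq):
            # Positions are appended in scan order, so each list stays sorted.
            index[item].setdefault(sid, []).append(pos)
    return index


def first_position_after(index, item, sid, after):
    """Return the first position of `item` in sequence `sid` that is
    strictly greater than `after`, or None if there is none."""
    positions = index.get(item, {}).get(sid, [])
    i = bisect_right(positions, after)  # binary search, no sequence scan
    return positions[i] if i < len(positions) else None


def support_of_extension(index, prefix_ends, item):
    """Count the sequences in which `item` occurs after the prefix.

    `prefix_ends` maps a sequence id to the position where the first
    instance of the prefix ends in that sequence (hypothetical layout).
    """
    return sum(
        1
        for sid, end in prefix_ends.items()
        if first_position_after(index, item, sid, end) is not None
    )
```

With the index built once, a call such as support_of_extension(index, prefix_ends, "B") answers an extension check from the stored positions alone, without rescanning the sequences.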
Keywords :
data mining; optimisation; bi-directional extension closure checking scheme; frequent closed sequential pattern discovery; frequent closed sequential pattern mining algorithm; optimization strategy; positional data; prefix sequences; Algorithm design and analysis; Data mining; Itemsets; Machine learning; Machine learning algorithms; Optimization; BI-Directional Extension closure check; Closed sequential pattern; Positional data
Conference_Title :
Machine Learning and Cybernetics (ICMLC), 2010 International Conference on
Conference_Location :
Qingdao
Print_ISBN :
978-1-4244-6526-2
DOI :
10.1109/ICMLC.2010.5581020