DocumentCode :
2255987
Title :
Fast discovery of frequent closed sequential patterns based on positional data
Author :
Huang, Guo-yan ; Yang, Fei ; Hu, Chang-zhen ; Ren, Jia-dong
Author_Institution :
Coll. of Inf. Sci. & Eng., Yanshan Univ., Qinhuangdao, China
Volume :
1
fYear :
2010
fDate :
11-14 July 2010
Firstpage :
444
Lastpage :
449
Abstract :
Frequent closed sequential pattern mining is one of the hot topics in data mining. This paper proposes a novel frequent closed sequential pattern mining algorithm, FCSM-PD (frequent closed sequential pattern mining algorithm based on positional data), which improves the BIDE algorithm by using positional data. The positional data records the position information of items. Because all the position information of the prefix sequences is stored in advance, checking whether an extension position exists for a prefix sequence only requires scanning the position information of that prefix sequence, rather than repeatedly scanning the pseudo-projected database as in the BI-Directional Extension closure checking scheme, which is the most time-consuming phase of BIDE. In addition, an optimization strategy is applied to reduce the time and memory cost of the mining process. The experimental results show that FCSM-PD takes significantly less running time than BIDE, especially on dense databases.
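The general idea of answering extension queries from stored positions, instead of re-scanning sequences, can be illustrated with a minimal Python sketch. This is not the authors' FCSM-PD implementation and it omits closure checking entirely; the function names (`build_positions`, `extend`, `support`), the single-item-per-event sequences, and the sample database used in the usage note are all assumptions made for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def build_positions(db):
    """Map each item to {sequence id: sorted list of positions of that item}."""
    index = defaultdict(dict)
    for sid, seq in enumerate(db):
        for pos, item in enumerate(seq):
            index[item].setdefault(sid, []).append(pos)
    return index


def extend(prefix_ends, item, index):
    """Extend a prefix by one item using only the stored positions.

    prefix_ends maps a sequence id to the position where the leftmost
    match of the prefix ends. The result is the same kind of map for
    prefix + item, containing only sequences that still match.
    """
    new_ends = {}
    for sid, end in prefix_ends.items():
        positions = index.get(item, {}).get(sid, [])
        k = bisect_right(positions, end)  # first occurrence strictly after `end`
        if k < len(positions):
            new_ends[sid] = positions[k]
    return new_ends


def support(pattern, db, index):
    """Count sequences containing `pattern` as a subsequence."""
    ends = {sid: -1 for sid in range(len(db))}  # empty prefix matches everywhere
    for item in pattern:
        ends = extend(ends, item, index)
    return len(ends)
```

For example, with the hypothetical database `db = ["CAABC", "ABCB", "CABC", "ABBCA"]` and `index = build_positions(db)`, calling `support("AB", db, index)` counts the sequences containing A followed later by B through binary searches on the position lists, without revisiting the sequences themselves.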
Keywords :
data mining; optimisation; bi-directional extension closure checking scheme; frequent closed sequential pattern discovery; frequent closed sequential pattern mining algorithm; optimization strategy; positional data; prefix sequences; Algorithm design and analysis; Itemsets; Machine learning; Machine learning algorithms; Optimization; BI-Directional Extension closure check; Closed sequential pattern
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Cybernetics (ICMLC), 2010 International Conference on
Conference_Location :
Qingdao
Print_ISBN :
978-1-4244-6526-2
Type :
conf
DOI :
10.1109/ICMLC.2010.5581020
Filename :
5581020