Title :
FMGSP: An Efficient Method of Mining Global Sequential Patterns
Author :
Zhang, Changhai ; Hu, Kongfa ; Liu, Haidong ; Ding, Youwei ; Chen, Ling
Author_Institution :
Yangzhou Univ., Yangzhou
Abstract :
Some existing distributed sequential pattern mining algorithms generate too many candidate sequences and incur high communication overhead. We therefore propose FMGSP (fast mining of global sequential patterns), an efficient algorithm for mining global sequential patterns on a distributed system. Our method of mining sequential patterns in a distributed environment differs from previous related work, and this paper makes two main contributions. First, the local sequential patterns obtained at every site in the distributed environment are compressed into a lexicographic sequence tree before all subtrees are distributed to the polling site. Second, an efficient pruning strategy called I/S-EP (item and sequence extension pruning) is proposed to reduce the number of candidate sequences. As a result, the communication cost in the network is greatly reduced when counting requests are sent to (or received from) the corresponding databases. Both theoretical analysis and experiments indicate that FMGSP performs particularly well on large databases, and that the global sequential patterns can be obtained effectively once the communication cost has been reduced.
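The compression step in the first contribution can be illustrated with a minimal sketch. The abstract does not specify the node layout, so everything below is an assumption made for illustration rather than the authors' implementation: a pattern is a tuple of sorted itemsets, each tree edge is labelled as a sequence extension ("S") or an item extension ("I"), and the names `Node`, `insert`, `build_tree`, `count_nodes`, and `patterns_in` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Edge label (extension kind, item) -> child node.
    # "S" starts a new itemset; "I" adds an item to the last itemset.
    children: dict = field(default_factory=dict)
    is_pattern: bool = False  # True if the path from the root is a stored pattern


def insert(root, pattern):
    """Insert one sequential pattern (a tuple of itemsets) into the tree."""
    node = root
    for itemset in pattern:
        for i, item in enumerate(sorted(itemset)):
            kind = "S" if i == 0 else "I"
            node = node.children.setdefault((kind, item), Node())
    node.is_pattern = True


def build_tree(patterns):
    """Compress a collection of patterns into one tree with shared prefixes."""
    root = Node()
    for pattern in patterns:
        insert(root, pattern)
    return root


def count_nodes(node):
    """Number of non-root nodes, i.e. items actually stored in the tree."""
    return sum(1 + count_nodes(child) for child in node.children.values())


def patterns_in(node, prefix=()):
    """Recover every stored pattern by walking the tree."""
    found = [prefix] if node.is_pattern else []
    for (kind, item), child in sorted(node.children.items()):
        if kind == "S":
            extended = prefix + ((item,),)
        else:
            extended = prefix[:-1] + (prefix[-1] + (item,),)
        found.extend(patterns_in(child, extended))
    return found


# Hypothetical local patterns found at one site.
local_patterns = [
    (("a",),),
    (("a",), ("b",)),
    (("a",), ("b", "c")),
    (("a", "b"),),
]
tree = build_tree(local_patterns)
flat_items = sum(len(itemset) for p in local_patterns for itemset in p)
```

In this example the flat list holds 8 items while the tree needs only 4 nodes, because every pattern shares the prefix `a`. Shared prefixes of this kind are what make a tree cheaper to send between sites than the raw pattern list.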
Keywords :
data mining; distributed processing; trees (mathematics); FMGSP; distributed sequential pattern mining; global sequential pattern mining; item extension pruning; lexicographic sequence tree; sequence extension pruning; Association rules; Computer science; Costs; Data mining; Distributed databases; Itemsets; Merging; Partitioning algorithms; Pattern analysis; Proteins
Conference_Title :
Fuzzy Systems and Knowledge Discovery, 2007. FSKD 2007. Fourth International Conference on
Conference_Location :
Haikou
Print_ISBN :
978-0-7695-2874-8
DOI :
10.1109/FSKD.2007.294