DocumentCode
2422419
Title
FMGSP: An Efficient Method of Mining Global Sequential Patterns
Author
Zhang, Changhai ; Hu, Kongfa ; Liu, Haidong ; Ding, Youwei ; Chen, Ling
Author_Institution
Yangzhou Univ., Yangzhou
Volume
2
fYear
2007
fDate
24-27 Aug. 2007
Firstpage
761
Lastpage
765
Abstract
Some existing distributed sequential pattern mining algorithms generate too many candidate sequences, which increases communication overhead. We therefore propose FMGSP (fast mining of global sequential patterns), an efficient algorithm for mining global sequential patterns on a distributed system. Our method differs from previous related work, and this paper makes two main contributions. First, the local sequential patterns obtained at every site in the distributed environment are compressed into a lexicographic sequence tree before the subtrees are distributed to the polling sites. Second, an efficient pruning strategy called I/S-EP (item and sequence extension pruning) is proposed to reduce the number of candidate sequences. As a result, the communication cost in the network is greatly reduced when counting requests are sent to (or received from) the corresponding databases. Both theoretical analysis and experiments indicate that FMGSP performs particularly well on large databases, and that global sequential patterns can be obtained effectively once the communication cost is reduced.
Keywords
data mining; distributed processing; trees (mathematics); FMGSP; distributed sequential pattern mining; global sequential pattern mining; item extension pruning; lexicographic sequence tree; sequence extension pruning; Association rules; Computer science; Costs; Data mining; Distributed databases; Itemsets; Merging; Partitioning algorithms; Pattern analysis; Proteins;
fLanguage
English
Publisher
ieee
Conference_Titel
Fuzzy Systems and Knowledge Discovery, 2007. FSKD 2007. Fourth International Conference on
Conference_Location
Haikou
Print_ISBN
978-0-7695-2874-8
Type
conf
DOI
10.1109/FSKD.2007.294
Filename
4406178