Regional Information Center for Science and Technology - Efficient Discovery of Frequent Approximate Sequential Patterns

DocumentCode :

3167107

Title :

Efficient Discovery of Frequent Approximate Sequential Patterns

Author :

Zhu, Feida ; Yan, Xifeng ; Han, Jiawei ; Yu, Philip S.

Author_Institution :

Univ. of Illinois at Urbana-Champaign, Champaign

Year :

2007

Date :

28-31 Oct. 2007

Firstpage :

751

Lastpage :

756

Abstract :

We propose an efficient algorithm for mining frequent approximate sequential patterns under the Hamming distance model. Our algorithm gains its efficiency by adopting a "break-down-and-build-up" methodology. The "breakdown" is based on the observation that all occurrences of a frequent pattern can be classified into groups, which we call strands. We developed efficient algorithms to quickly mine out all strands by iterative growth. In the "build-up" stage, these strands are grouped up to form the support sets from which all approximate patterns would be identified. A salient feature of our algorithm is its ability to grow the frequent patterns by iteratively assembling building blocks of significant sizes in a local search fashion. By avoiding incremental growth and global search, we achieve greater efficiency without losing the completeness of the mining result. Our experimental studies demonstrate that our algorithm is efficient in mining globally repeating approximate sequential patterns that would have been missed by existing methods.
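To make the Hamming distance model in the abstract concrete, the following is a minimal Python sketch of a naive baseline: it slides a window over one sequence and keeps every position whose window differs from a given pattern in at most a fixed number of positions. It is not the strand-based "break-down-and-build-up" algorithm of the paper, which is designed precisely to avoid this kind of global scan. The function names `hamming` and `approximate_occurrences`, the `max_mismatches` parameter, and the reading of "support" as the set of matching window offsets are assumptions made for illustration only.

```python
from typing import List


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def approximate_occurrences(sequence: str, pattern: str, max_mismatches: int) -> List[int]:
    """Start offsets of every window within max_mismatches of the pattern.

    Naive global scan (assumed baseline, not the paper's method):
    O(len(sequence) * len(pattern)) character comparisons.
    """
    m = len(pattern)
    return [
        i
        for i in range(len(sequence) - m + 1)
        if hamming(sequence[i:i + m], pattern) <= max_mismatches
    ]
```

For example, `approximate_occurrences("ACGTACCTAGGT", "ACGT", 1)` returns the offsets of the exact occurrence and of the two windows that differ from the pattern in a single position. Under the assumption above, a pattern would count as frequent when the number of such offsets reaches a chosen threshold.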

Keywords :

data mining; search problems; Hamming distance model; approximate sequential patterns; break-down-and-build-up methodology; frequent approximate sequential patterns; global search; incremental growth; Assembly; Bioinformatics; DNA; Data analysis; Data mining; Genomics; Hamming distance; Iterative algorithms; Pattern analysis; Sequences;

Language :

English

Publisher :

IEEE

Conference_Title :

Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on

Conference_Location :

Omaha, NE

ISSN :

1550-4786

Print_ISBN :

978-0-7695-3018-5

Type :

conf

DOI :

10.1109/ICDM.2007.75

Filename :

4470322

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3167107