DocumentCode :
1801675
Title :
Mining top-K Frequent and Flexible Pattern from sequences
Author :
Zhang Junyan ; Min Fan
Author_Institution :
Information Science and Technology College, Chengdu University, China
fYear :
2013
fDate :
1-8 Jan. 2013
Firstpage :
1
Lastpage :
5
Abstract :
Pattern mining is a popular topic in biological sequence analysis. With the introduction of wildcard gaps, more interesting patterns can be mined. In this paper, we propose a new definition of pattern frequency under which the Apriori property holds. We define a pattern mining problem called Mining top-K Frequent Patterns (MFP), where gaps are mined instead of specified. Compared with existing problems, MFP does not require any domain knowledge from the user. However, theoretical analysis and experimental results show that MFP favors inflexible patterns. We then define another problem, called Mining top-K Frequent and Flexible Patterns (MF2P), where the flexibility threshold of each gap is specified by the user. We develop algorithms with polynomial complexity for both problems. Patterns can grow from both sides. Some interesting biological patterns mined by our algorithms are discussed.
Keywords :
constraints; pattern mining; sequence; wildcard gap
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Conference Anthology, IEEE
Conference_Location :
China
Type :
conf
DOI :
10.1109/ANTHOLOGY.2013.6784804
Filename :
6784804