DocumentCode
1983990
Title
Pattern Matching with Wildcard Gaps Based on Cross List
Author
Junyan Zhang ; Chenhui Yang
Author_Institution
Key Lab. of Pattern Recognition & Intell. Inf. Process. of Sichuan, Chengdu Univ., Chengdu, China
Volume
2
fYear
2013
fDate
28-29 Oct. 2013
Firstpage
154
Lastpage
156
Abstract
Pattern matching is fundamental to applications such as text retrieval, string query, and biological sequence analysis, so efficient algorithms for it are in great demand. In this paper, a wildcard is defined to match any single character in a sequence. Multiple wildcards form a gap, and the length of a flexible gap is arbitrary. We design the CLPM algorithm, which uses a cross list index structure to perform pattern matching with flexible wildcard gaps. A preprocessing algorithm initializes the cross list so as to reduce the search space. In the CLPM algorithm, effective intervals are defined and computed from the start positions of each sub-pattern in each string, which helps to obtain the matching result set. Moreover, approximate pattern matching is converted to short exact pattern matching. Comparative experiments are conducted on the DBLP title data set. The results show that the CLPM algorithm performs better than other algorithms in the same field.
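The abstract's idea of matching from the start positions of each sub-pattern can be illustrated with a minimal Python sketch. This is not the CLPM algorithm and does not build a cross list; it only assumes that a pattern is given as a list of sub-patterns separated by flexible gaps of any length (including zero). The function names `start_positions` and `match_with_gaps` are hypothetical and chosen for illustration.

```python
from bisect import bisect_left


def start_positions(text, sub):
    """Return every index at which `sub` occurs in `text` (overlaps allowed)."""
    positions = []
    i = text.find(sub)
    while i != -1:
        positions.append(i)
        i = text.find(sub, i + 1)
    return positions


def match_with_gaps(text, subpatterns):
    """Chain sub-pattern occurrences separated by gaps of arbitrary length.

    For each occurrence of the first sub-pattern, pick the earliest
    occurrence of each following sub-pattern that starts at or after the
    end of the previous one. Returns one tuple of start positions per
    successful chain. (Illustrative only; not the paper's method.)
    """
    # Preprocessing step: sorted start positions of every sub-pattern.
    index = [start_positions(text, s) for s in subpatterns]
    results = []
    for first in index[0]:
        chain = [first]
        end = first + len(subpatterns[0])
        for sub, starts in zip(subpatterns[1:], index[1:]):
            k = bisect_left(starts, end)  # earliest start >= end of previous
            if k == len(starts):
                chain = None  # no later occurrence, so this chain fails
                break
            chain.append(starts[k])
            end = starts[k] + len(sub)
        if chain is not None:
            results.append(tuple(chain))
    return results
```

For example, `match_with_gaps("abxxcdabcd", ["ab", "cd"])` chains the occurrence of "ab" at position 0 with "cd" at position 4 (a gap of two characters) and the occurrence at 6 with "cd" at 8 (an empty gap). Precomputing the sorted start positions lets each chaining step use a binary search instead of rescanning the text.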
Keywords
data structures; string matching; CLPM algorithm design; DBLP title data set; arbitrary flexible wildcard gap length; cross-list index structure; pattern matching approximation; preprocessing algorithm; search space reduction; sequence character matching; subpattern start positions; Algorithm design and analysis; Approximation algorithms; Educational institutions; Indexes; Pattern matching; Presses; Silicon; cross list; pattern matching; wildcard gaps;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence and Design (ISCID), 2013 Sixth International Symposium on
Conference_Location
Hangzhou
Type
conf
DOI
10.1109/ISCID.2013.152
Filename
6804851
Link To Document