DocumentCode :
2777086
Title :
Approximate string matching in sublinear expected time
Author :
Chang, William I. ; Lawler, Eugene L.
Author_Institution :
Comput. Sci. Div., California Univ., Berkeley, CA, USA
fYear :
1990
fDate :
22-24 Oct 1990
Firstpage :
116
Abstract :
The k differences approximate string matching problem specifies a text string of length n, a pattern string of length m, and the number k of differences (insertions, deletions, substitutions) allowed in a match, and asks for every location in the text where a match occurs. Previous algorithms required at least O(nk) time. When k is as large as a fraction of m, no substantial progress has been made over O (nm) dynamic programming. The authors have investigated much faster algorithms for restricted cases of the problem, such as when the text string is random and errors are not too frequent. They have devised an algorithm that, for k<m/log n+ O(1), runs in time O((n/m)k log n) on the average. In the worst case their algorithm is O(nk), but it is still an improvement in that it is very practical and uses only O(n) space compared with O(n) or O(n2). The authors define an approximate substring matching problem and give efficient algorithms based on their techniques. Special cases include several applications to genetics and molecular biology
Keywords :
computational complexity; dynamic programming; molecular biophysics; pattern recognition; search problems; approximate string matching; deletions; differences; dynamic programming; efficient algorithms; genetics; infrequent errors; insertions; molecular biology; pattern string; random string; sublinear expected time; substitutions; substring matching; text string; Assembly; Computer science; DNA; Dynamic programming; Genetics; Humans; Pattern analysis; Pattern matching; Performance analysis; Sequences;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Foundations of Computer Science, 1990. Proceedings., 31st Annual Symposium on
Conference_Location :
St. Louis, MO
Print_ISBN :
0-8186-2082-X
Type :
conf
DOI :
10.1109/FSCS.1990.89530
Filename :
89530
Link To Document :
بازگشت