DocumentCode :
1219803
Title :
Performance and architectural issues for string matching
Author :
Isenman, Mernll E. ; Shasha, Dennis E.
Author_Institution :
AT&T Bell Lab., Holmdel, NJ, USA
Volume :
39
Issue :
2
fYear :
1990
fDate :
2/1/1990 12:00:00 AM
Firstpage :
238
Lastpage :
250
Abstract :
The authors introduce special heuristics to the Knuth-Morris-Pratt algorithm to reduce the time and space required to perform the string matching. They compare their hardware-based approach to the software approaches embodied in the Unix system grep and fgrep commands. Simulation results show that the hardware approach can provide a 25-500-fold performance improvement, depending on the complexity of the query, and that it is fast enough, even in the presence of variable-length `don´t cares´ to keep up with a 20-million character/second disk. The approach compares favorably to other hardware designs in speed and space. The proposed hardware implementation requires 10 kB of one cycle static memory, 28 single-character comparators, four 16-b adders, and control logic for four finite-state machines with a term-matcher controller. After that, additional hardware produces negligible performance improvements for queries with up to 80 terms, about half of which have variable-length `don´t cares´
Keywords :
database management systems; finite automata; information retrieval; Knuth-Morris-Pratt algorithm; Unix system; architectural issues; control logic; finite-state machines; hardware implementation; hardware-based approach; performance issues; software approaches; string matching; term-matcher controller; Automata; Hardware; Information retrieval; Law; Legal factors; Logic; Medical simulation; Pattern matching; Software libraries; Testing;
fLanguage :
English
Journal_Title :
Computers, IEEE Transactions on
Publisher :
ieee
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/12.45209
Filename :
45209
Link To Document :
بازگشت