Title :
On approximate string matching
Author_Institution :
Dept. of Comput. Sci., Tel Aviv Univ., Israel
Abstract :
Two practical universal source coding schemes are proposed. One is an approximate fixed length string matching data compression, and the other is LZ-type quasi parsing by approximate string matching. It is shown that in the former algorithm the compression rate converges to the theoretical bound of R(D) for a large class of processes as the database size and the string length tend to infinity. A similar result holds for the latter algorithm in the limit of infinite data base size. The performance of the two algorithms is evaluated where data base size is finite and string length finite. The duality between the two algorithms is proved with some asymptotic properties concerning the workings of an approximate string matching algorithm for ergodic stationary sources
Keywords :
data compression; encoding; grammars; LZ-type quasi parsing; approximate string matching; asymptotic properties; compression rate; data compression; duality; ergodic stationary sources; universal source coding schemes; Computer science; Data compression; Databases; Entropy; H infinity control; Image coding; Source coding; Telephony; Terminology;
Conference_Titel :
Data Compression Conference, 1993. DCC '93.
Conference_Location :
Snowbird, UT
Print_ISBN :
0-8186-3392-1
DOI :
10.1109/DCC.1993.253135