DocumentCode
3062007
Title
Fast multi-match Lempel-Ziv
Author
Pinho, M.S. ; Finamore, W.A. ; Pearlman, W.A.
Author_Institution
Rensselaer Polytech. Inst., Troy, NY, USA
fYear
1999
fDate
29-31 Mar 1999
Firstpage
545
Abstract
Summary form only given. One of the most popular encoders in the literature is the LZ78, which was proposed by Ziv and Lempel (1978). We establish a recursive way to find the longest m-tuple match. We prove the following theorem that shows how to obtain a longest (m+1)-tuple match from the longest m-tuple match. It shows that a (m+1)-tuple match is the concatenation of the first (m-1) words of the m-tuple match with the next longest double match. Therefore, the longest (m+1)-tuple match can be found using the m-tuple match and a procedure to compute the longest double match. Our theorem is as follows. Let A be a source alphabet, let A* be the set of all finite strings of A, and D⊂A*, such that if x∈D then all prefixes of x belong to D. Let u denote a one-sided infinite sequence. If b1m is the longest m-tuple match in u, with respect to D, then there is a longest (m+1)-tuple match bˆ1m+1, such that bˆ i=bi,∀i∈{1,...m-1}. We implemented the fast mmLZ and the results show a improvement in compression of around 5% over the LZW, in the Canterbury Corpus (Arnold and Bell, 1997) with little extra computational cost
Keywords
concatenated codes; data compression; sequences; string matching; Canterbury Corpus; LZ78; compression; concatenation; infinite sequence; m-tuple match; multi-match Lempel-Ziv encoder; recursive match; Algorithm design and analysis; Compression algorithms; Computational efficiency; Data compression; Dictionaries; Encoding; Numerical analysis;
fLanguage
English
Publisher
ieee
Conference_Titel
Data Compression Conference, 1999. Proceedings. DCC '99
Conference_Location
Snowbird, UT
ISSN
1068-0314
Print_ISBN
0-7695-0096-X
Type
conf
DOI
10.1109/DCC.1999.785702
Filename
785702
Link To Document