DocumentCode :
2707804
Title :
Off-line compression by extensible motifs
Author :
Apostolico, Alberto ; Comin, Matteo ; Parida, Laxmi
fYear :
2005
fDate :
29-31 March 2005
Firstpage :
450
Abstract :
Summary form only given. We present lossy off-line data compression techniques by textual substitution in which the patterns used in compression are chosen among the extensible motifs that are found to recur in the textstring with a minimum pre-specified frequency. A motif is to be interpreted here as a sequence of intermixed solid and don't care characters that obeys, in addition, some conditions of saturation: most notably, it must not be possible to eliminate some don't cares in the pattern without having to forfeit some of its occurrences. Motif discovery and motif-driven parses of various kinds have been previously introduced and used in Apostolico et al. (2004) and Apostolico et al. (2003). Whereas the motifs considered in those studies are "rigid", here we assume that each sequence of gaps present in a motif comes endowed with some individually prescribed degree of elasticity, whereby the same pattern may be stretched to fit segments of the source that match at all the solid characters but are otherwise of different lengths. This is expected to save on the size of the codebook, and hence to improve compression.
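The difference between a rigid and an extensible motif described in the abstract can be illustrated with a minimal sketch. The representation used here (a list of solid characters interleaved with `(min_gap, max_gap)` tuples) and the helper names `motif_to_regex` and `occurrences` are assumptions made for illustration only; they are not the authors' notation, and the sketch covers only matching a given motif, not motif discovery, saturation checks, or the compression scheme itself.

```python
import re

def motif_to_regex(motif):
    """Compile a motif into a regex.

    `motif` is a hypothetical representation: a list whose items are either
    a solid character (str) or a (min_gap, max_gap) tuple giving the allowed
    number of don't care characters at that position.
    """
    parts = []
    for item in motif:
        if isinstance(item, tuple):
            lo, hi = item
            parts.append(".{%d,%d}" % (lo, hi))  # elastic run of don't cares
        else:
            parts.append(re.escape(item))        # solid character
    # A lookahead lets finditer report occurrences that overlap each other.
    return re.compile("(?=(" + "".join(parts) + "))", re.DOTALL)

def occurrences(motif, text):
    """Return (start, matched_segment) pairs, at most one per start position."""
    return [(m.start(), m.group(1)) for m in motif_to_regex(motif).finditer(text)]

text = "axbayyybazzzzb"
rigid = ["a", (1, 1), "b"]       # exactly one don't care between the solids
extensible = ["a", (1, 3), "b"]  # one to three don't cares between the solids

print(occurrences(rigid, text))
print(occurrences(extensible, text))
```

In this toy input the extensible motif covers segments of different lengths with a single pattern, which is the effect the abstract expects to reduce codebook size.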
Keywords :
data compression; string matching; table lookup; text analysis; codebook; extensible motifs; lossy off-line data compression; motif discovery; motif-driven parses; textstring; textual substitution; Data compression; Elasticity; Encoding; Error analysis; Error correction codes; Frequency; Image coding; Pattern matching; Solids; USA Councils;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Compression Conference, 2005. Proceedings. DCC 2005
ISSN :
1068-0314
Print_ISBN :
0-7695-2309-9
Type :
conf
DOI :
10.1109/DCC.2005.59
Filename :
1402207