DocumentCode :
2743664
Title :
Using partial-matching approach with Sequitur for contact-based coding
Author :
Lakhani, Gopal ; Radhakrishnan Sethurama
Author_Institution :
Texas Tech. Univ., Lubbock, TX, USA
fYear :
2004
fDate :
23-25 March 2004
Firstpage :
548
Abstract :
This paper presents a Sequitur, which is a powerful realization of the grammar-based text compression approach. First, it derives a context-free grammar, which generates the given input as its only language. Then it represents the grammar as a single sequence of symbols, where a symbol represents either a rule head or an alphabet, and finally it uses arithmetic coder to code the sequence. Sequitur encodes each symbol individually. A modified approach is to code a rule symbol in the context of the last alphabet of the preceding rule symbol. Thereby, the partial matching (PM) approach is followed to realize code reduction. The crux of this approach is to maintain multiple dictionaries, each containing rules that begin with the same alphabet.
Keywords :
arithmetic codes; context-free grammars; data compression; sequential codes; text analysis; Sequitur; Sequitur encoding; arithmetic coder; code reduction; contact-based coding; context-free grammar; grammar-based text compression approach; partial-matching approach; preceding rule symbol; rule head; sequence code; symbol representation; symbol sequence; text dictionary; Arithmetic; Data compression; Decoding; Dictionaries;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Data Compression Conference, 2004. Proceedings. DCC 2004
ISSN :
1068-0314
Print_ISBN :
0-7695-2082-0
Type :
conf
DOI :
10.1109/DCC.2004.1281524
Filename :
1281524
Link To Document :
بازگشت