DocumentCode
2743664
Title
Using partial-matching approach with Sequitur for contact-based coding
Author
Lakhani, Gopal ; Radhakrishnan Sethurama
Author_Institution
Texas Tech. Univ., Lubbock, TX, USA
fYear
2004
fDate
23-25 March 2004
Firstpage
548
Abstract
This paper presents a Sequitur, which is a powerful realization of the grammar-based text compression approach. First, it derives a context-free grammar, which generates the given input as its only language. Then it represents the grammar as a single sequence of symbols, where a symbol represents either a rule head or an alphabet, and finally it uses arithmetic coder to code the sequence. Sequitur encodes each symbol individually. A modified approach is to code a rule symbol in the context of the last alphabet of the preceding rule symbol. Thereby, the partial matching (PM) approach is followed to realize code reduction. The crux of this approach is to maintain multiple dictionaries, each containing rules that begin with the same alphabet.
Keywords
arithmetic codes; context-free grammars; data compression; sequential codes; text analysis; Sequitur; Sequitur encoding; arithmetic coder; code reduction; contact-based coding; context-free grammar; grammar-based text compression approach; partial-matching approach; preceding rule symbol; rule head; sequence code; symbol representation; symbol sequence; text dictionary; Arithmetic; Data compression; Decoding; Dictionaries;
fLanguage
English
Publisher
ieee
Conference_Titel
Data Compression Conference, 2004. Proceedings. DCC 2004
ISSN
1068-0314
Print_ISBN
0-7695-2082-0
Type
conf
DOI
10.1109/DCC.2004.1281524
Filename
1281524
Link To Document