• DocumentCode
    2743664
  • Title

    Using partial-matching approach with Sequitur for contact-based coding

  • Author

    Lakhani, Gopal ; Radhakrishnan Sethurama

  • Author_Institution
    Texas Tech. Univ., Lubbock, TX, USA
  • fYear
    2004
  • fDate
    23-25 March 2004
  • Firstpage
    548
  • Abstract
    This paper presents a Sequitur, which is a powerful realization of the grammar-based text compression approach. First, it derives a context-free grammar, which generates the given input as its only language. Then it represents the grammar as a single sequence of symbols, where a symbol represents either a rule head or an alphabet, and finally it uses arithmetic coder to code the sequence. Sequitur encodes each symbol individually. A modified approach is to code a rule symbol in the context of the last alphabet of the preceding rule symbol. Thereby, the partial matching (PM) approach is followed to realize code reduction. The crux of this approach is to maintain multiple dictionaries, each containing rules that begin with the same alphabet.
  • Keywords
    arithmetic codes; context-free grammars; data compression; sequential codes; text analysis; Sequitur; Sequitur encoding; arithmetic coder; code reduction; contact-based coding; context-free grammar; grammar-based text compression approach; partial-matching approach; preceding rule symbol; rule head; sequence code; symbol representation; symbol sequence; text dictionary; Arithmetic; Data compression; Decoding; Dictionaries;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Data Compression Conference, 2004. Proceedings. DCC 2004
  • ISSN
    1068-0314
  • Print_ISBN
    0-7695-2082-0
  • Type

    conf

  • DOI
    10.1109/DCC.2004.1281524
  • Filename
    1281524