• DocumentCode
    3048962
  • Title
    Combining PPM models using a text mining approach
  • Author
    Teahan, W.J.; Harper, David J.
  • Author_Institution
    Sch. of Comput. & Math. Sci., Robert Gordon Univ., Aberdeen, UK
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    153
  • Lastpage
    162
  • Abstract
    This paper introduces a novel switching method that can be used to combine two or more PPM models. The method derives from our earlier work on modelling English and on text mining, and it takes advantage of both to improve compression performance significantly. The combination of models performs at least as well as (and in many cases significantly better than) the best-performing individual model. The paper reviews PPM-based text mining, as it underpins the approach taken by the algorithm. It then describes how PPM models are combined by applying a novel variation of the Viterbi algorithm. Results are then presented, followed by a discussion of related work, with conclusions.
  • Keywords
    data compression; text analysis; English; PPM models; Viterbi algorithm; compression performance; switching method; text mining; Data mining; Information analysis; Mathematical model; Natural languages; Performance loss; Source coding; Uniform resource locators
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Compression Conference, 2001. Proceedings. DCC 2001.
  • Conference_Location
    Snowbird, UT
  • ISSN
    1068-0314
  • Print_ISBN
    0-7695-1031-0
  • Type
    conf
  • DOI
    10.1109/DCC.2001.917146
  • Filename
    917146