DocumentCode
3048962
Title
Combining PPM models using a text mining approach
Author
Teahan, W.J. ; Harper, David J.
Author_Institution
Sch. of Comput. & Math. Sci., Robert Gordon Univ., Aberdeen, UK
fYear
2001
fDate
2001
Firstpage
153
Lastpage
162
Abstract
This paper introduces a novel switching method that can be used to combine two or more PPM models. The work derives from our earlier work on modelling English and on text mining, and the approach takes advantage of both to improve compression performance significantly. The performance of the combined models is at least as good as (and in many cases significantly better than) that of the best-performing individual model. The paper reviews PPM-based text mining, as it underpins the approach taken by the algorithm. It then describes how PPM models are combined by applying a novel variation of the Viterbi algorithm. Results are then presented, followed by a discussion of related work and conclusions.
Keywords
data compression; text analysis; English; PPM models; Viterbi algorithm; compression performance; switching method; text mining; Data mining; Information analysis; Mathematical model; Natural languages; Performance loss; Source coding; Uniform resource locators
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Compression Conference, 2001. Proceedings. DCC 2001.
Conference_Location
Snowbird, UT
ISSN
1068-0314
Print_ISBN
0-7695-1031-0
Type
conf
DOI
10.1109/DCC.2001.917146
Filename
917146