Authors:
Åberg, J. ; Shtarkov, Yu.M. ; Smeets, B.J.M.
Abstract :
Some of the best compression ratios for text compression are provided by the PPM (prediction by partial matching) class of algorithms. These algorithms are based on arithmetic coding using a fixed-depth Markov chain model of the source, i.e., the subsequence of symbols generated in any state s of the source is assumed to be the output of a memoryless subsource w=w(s). One of the most crucial steps in any PPM algorithm is the choice of the escape probability, i.e., the probability of a previously unseen symbol appearing in a given state. Let A={1,...,M} be the source alphabet, x^k=x_1 x_2 ... x_k ∈ A^k be a sequence of symbols generated by a subsource w with unknown parameters, and m=m(x^k) be the number of distinct symbols in x^k. In most incarnations of PPM, the expression for the escape probability used for coding is of the form ϑ(E|x^k) = f(k,m)/(k + mα + f(k,m)), where f is a function of the counts and α is a parameter. The encoding of the "escape symbol" E is followed by a description of the new symbol, which is independent of the choice of escape probability and is not discussed. In almost all PPM schemes, the expression for the escape probability has been chosen heuristically. Following the method of Shtarkov et al. (see Problems of Information Transmission, vol. 31, no. 2, pp. 20-35, 1995), we use a universal coding approach to the estimation of the escape probabilities.
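The general escape-probability form in the abstract can be sketched in a few lines. The mapping of the classic PPMC and PPMD heuristics onto particular choices of (f, α) below is an illustration under the stated general form, not a construction taken from the paper:

```python
def escape_probability(k, m, alpha, f):
    """General PPM escape-probability form:
    P(E | x^k) = f(k, m) / (k + m*alpha + f(k, m)),
    where k is the number of symbols seen in the state,
    m the number of distinct symbols, alpha a parameter,
    and f a chosen function of the counts."""
    return f(k, m) / (k + m * alpha + f(k, m))

# Two classic heuristics expressed in this form (illustrative mapping):
# PPMC: f(k, m) = m,     alpha = 0    -> P(E) = m / (k + m)
# PPMD: f(k, m) = m / 2, alpha = -1/2 -> P(E) = m / (2k)
ppmc = escape_probability(10, 3, 0.0, lambda k, m: m)       # 3 / 13
ppmd = escape_probability(10, 3, -0.5, lambda k, m: m / 2)  # 3 / 20 = 0.15
```

The remaining probability mass, 1 − P(E), is then shared among the m symbols already observed in the state, according to their counts.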
Keywords :
Markov processes; arithmetic coding; data compression; document image processing; image coding; image matching; memoryless systems; prediction theory; probability; source coding; word processing; algorithms; compression ratios; encoding; escape probability estimation; escape symbol; fixed-depth Markov chain model; memoryless subsource; prediction by partial matching; source alphabet; symbol subsequences; text compression; universal source coding theory; frequency estimation; information theory; optimization methods; probability distribution; state estimation