Title :
Learning to compress ergodic sources
Author :
Baxter, Jonathan ; Shawe-Taylor, John
Author_Institution :
Dept. of Comput. Sci., London Univ., UK
Abstract :
Abstract only given. We present an adaptive coding technique that is shown to achieve optimal coding in the limit as the size of the text grows, while the data structures associated with the code grow only linearly with the text. The approach relies on Huffman codes generated relative to the context in which a particular character occurs; the Huffman codes themselves are inferred from the data already seen. A key part of the paper shows that the loss per character incurred by the learning process tends to zero as the size of the text tends to infinity. This involves an analysis in an on-line learning framework bounding the cumulative loss, where loss is defined to be the excess code length. By using the Bayes prediction distribution and code, the expected loss per character converges to zero at the best possible rate of O(log n/n). By allowing the length of contexts to grow in response to commonly occurring subsequences, the coding is efficient precisely where it needs to be, achieving a high compression rate at a relatively low overhead in data structure storage.
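To make the idea of context-conditioned adaptive Huffman coding concrete, the following is a minimal illustrative sketch, not the authors' algorithm: per-context symbol counts are maintained, a Huffman code is rebuilt from the counts seen so far before each character is coded, and the counts are updated afterwards. The `order` parameter, the Laplace-style initial count of 1 per symbol (so every symbol always has a codeword), and all function names are assumptions of this sketch.

```python
import heapq
from collections import Counter, defaultdict
from itertools import count

def huffman_code(freqs):
    """Build a prefix-free Huffman code (symbol -> bitstring) from a frequency table."""
    if len(freqs) == 1:
        return {next(iter(freqs)): "0"}
    tiebreak = count()  # avoids comparing tree nodes when frequencies tie
    heap = [(f, next(tiebreak), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    code = {}
    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: recurse into children
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                            # leaf: record the codeword
            code[node] = prefix
    walk(heap[0][2], "")
    return code

def adaptive_encode(text, alphabet, order=1):
    """Encode text with per-context Huffman codes inferred from the data seen so far."""
    # Every symbol starts with count 1 in every context (an assumption of this
    # sketch), so each symbol always has a codeword before it is first seen.
    contexts = defaultdict(lambda: Counter({s: 1 for s in alphabet}))
    bits = []
    ctx = ""
    for ch in text:
        code = huffman_code(contexts[ctx])  # code built only from past data
        bits.append(code[ch])
        contexts[ctx][ch] += 1              # learn from the character just coded
        ctx = (ctx + ch)[-order:]           # slide the context window
    return "".join(bits)
```

Because the code for each context is derived only from characters already seen, a decoder maintaining the same counts can rebuild the identical code at each step and invert the encoding without side information; the excess code length incurred while the counts are still poor estimates is the "loss" that the paper's analysis bounds.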
Keywords :
Bayes methods; Huffman codes; adaptive codes; data structures; source coding; Bayes prediction distribution; adaptive coding; compression rate; cumulative loss; ergodic sources; excess code length; loss per character; on-line learning framework; optimal coding; subsequences; text; Adaptive coding; Character generation; Computer science; Data structures; Economic forecasting; H infinity control; Mathematics
Conference_Titel :
Data Compression Conference, 1996. DCC '96. Proceedings
Conference_Location :
Snowbird, UT
Print_ISBN :
0-8186-7358-3
DOI :
10.1109/DCC.1996.488351