DocumentCode :
778383
Title :
On universal types
Author :
Seroussi, Gadiel
Author_Institution :
Hewlett-Packard Labs., Palo Alto, CA
Volume :
52
Issue :
1
fYear :
2006
Firstpage :
171
Lastpage :
189
Abstract :
The universal type class of a sequence xn is defined, in analogy to the notion underlying the classical method of types. Two sequences of the same length are said to be of the same universal (LZ) type if and only if they yield the same dictionary (or, equivalently, parsing tree) in the incremental parsing of Ziv and Lempel (1978). It is shown that for any finite order k, the variational distance between the kth-order empirical probability distributions of two sequences of the same universal type vanishes as the sequence length tends to infinity. Consequently, for any k and any kth-order probability assignment, the difference between the normalized logarithms of the probabilities assigned to two sequences of the same universal type also vanishes asymptotically. The size of a universal type class is studied, and it is shown that its asymptotic behavior parallels that of the conventional counterpart, with the LZ78 code length playing the role of the empirical entropy. The number of universal types for sequences of length n is estimated, and shown to be of the form exp((1+o(1))gamman/logn) for a well characterized constant gamma. Algorithms for enumerating the sequences in a universal type class, and for drawing a sequence from the class with uniform probability are described. As an application, the problem of universal simulation of individual sequences is considered. A sequence drawn with uniform probability from the universal type class of xn is an optimal simulation of xn in a well defined mathematical sense
Keywords :
data compression; entropy codes; grammars; probability; random codes; random processes; random sequences; Lempel-Ziv coding; code length; dictionary; empirical entropy; incremental parsing; normalized logarithm; optimal simulation; probability distribution; random process simulation; sequence length; universal type class; Channel coding; Dictionaries; Entropy; H infinity control; Helium; Informatics; Information theory; Probability distribution; Random processes; Statistical distributions; Lempel-Ziv coding; method of types; random process simulation; type classes; universal simulation; universal types;
fLanguage :
English
Journal_Title :
Information Theory, IEEE Transactions on
Publisher :
ieee
ISSN :
0018-9448
Type :
jour
DOI :
10.1109/TIT.2005.860437
Filename :
1564433
Link To Document :
بازگشت