Title :
Entropy of Patterns of i.i.d. Sequences—Part I: General Bounds
Author_Institution :
Univ. of Utah, Salt Lake City
fDate :
5/1/2008 12:00:00 AM
Abstract :
Tight bounds on the block entropy of patterns of sequences generated by independent and identically distributed (i.i.d.) sources are derived. A pattern of a sequence is a sequence of integer indices with each index representing the order of first occurrence of the respective symbol in the original sequence. Since a pattern is the result of data processing on the original sequence, its entropy cannot be larger. Bounds derived here describe the pattern entropy as function of the original i.i.d. source entropy, the alphabet size, the symbol probabilities, and their arrangement in the probability space. Matching upper and lower bounds derived provide a useful tool for very accurate approximations of pattern block entropies for various distributions, and for assessing the decrease of the pattern entropy from that of the original i.i.d. sequence.
Keywords :
block codes; entropy codes; sequences; alphabet size; coding; iid sequences; pattern block entropy; source entropy; symbol probabilities; Cities and towns; Communication system control; Conferences; Data processing; Dictionaries; Entropy; Gas insulated transmission lines; Information theory; Pattern matching; Uncertainty; Entropy; index sequences; patterns;
Journal_Title :
Information Theory, IEEE Transactions on
DOI :
10.1109/TIT.2008.920203