Title :
Estimating the information content of symbol sequences and efficient codes
Author :
Grassberger, Peter
Author_Institution :
Dept. of Phys., Wuppertal Univ., West Germany
fDate :
5/1/1989
Abstract :
Several variants of an algorithm for estimating Shannon entropies of symbol sequences are presented. They are all related to the Lempel-Ziv algorithm (1976, 1977) and to recent algorithms for estimating Hausdorff dimensions. The average storage and running times increase as N and N log N, respectively, with the sequence length N. These algorithms proceed basically by constructing efficient codes. They seem to be the optimal algorithms for sequences with strong long-range correlations, e.g., natural languages. An application to written English illustrates their use.
Keywords :
FORTRAN listings; binary sequences; codes; entropy; information theory; Hausdorff dimensions; Lempel-Ziv algorithm; Shannon entropy; efficient codes; information content; strong long-range correlations; symbol sequences; written English; Binary codes; Disk recording; Gaussian processes; Natural languages; Optical recording; Physics; Probability
Journal_Title :
Information Theory, IEEE Transactions on