Title :
Patterns and exchangeability
Author :
Santhanam, N.P. ; Madiman, M.
Author_Institution :
Dept. of Electr. Eng., Univ. of Hawaii at Manoa, Honolulu, HI, USA
Abstract :
In statistics and theoretical computer science, the notion of exchangeability provides a framework for the study of large alphabet scenarios. This idea has been developed in an important line of work starting with Kingman's study of population genetics, and leading on to the paintbox processes of Kingman, the Chinese restaurant processes and their generalizations. In information theory, the notion of the pattern of a sequence provides a framework for the study of large alphabet scenarios, as developed in work of Orlitsky and collaborators. The pattern is a statistic that captures all the information present in the data, and yet is universally compressible regardless of the alphabet size. In this note, connections are made between these two lines of work: specifically, patterns are examined in the context of exchangeability. After observing the relationship between patterns and Kingman's paintbox processes, and discussing the redundancy of a class of mixture codes for patterns, alternate representations of patterns in terms of graph limits are discussed.
Keywords :
formal languages; information theory; pattern recognition; statistical analysis; alphabet scenarios; exchangeability; paintbox processes; sequence pattern; Bayesian methods; Collaborative work; Computer science; Genetic communication; Information theory; Natural languages; Sequences; Statistical distributions; Statistics; Testing
Conference_Title :
Information Theory Proceedings (ISIT), 2010 IEEE International Symposium on
Conference_Location :
Austin, TX
Print_ISBN :
978-1-4244-7890-3
Electronic_ISBN :
978-1-4244-7891-0
DOI :
10.1109/ISIT.2010.5513581