Title :
Codes for unordered sets of words
Author :
Reznik, Yuriy A.
Author_Institution :
Qualcomm Inc., San Diego, CA, USA
Date :
July 31 2011-Aug. 5 2011
Abstract :
We study the problem of coding unordered sets of words, which appear in language processing, retrieval, machine learning, computer vision, and other fields. We review known results about this problem and offer a code construction technique suitable for solving it. We show that, in a memoryless model, the expected length of our codes approaches Ht - log m! + O(m), where m is the number of words in the set, t is the combined length of all words, and H is the entropy of the source. We also offer a design of a universal code for sets of words and analyze its redundancy.
Keywords :
redundancy; source coding; code construction technique; computer vision; entropy; language processing; machine learning; memoryless model; redundancy analysis; retrieval; universal code; unordered sets of words; Channel coding; Computer science;
Conference_Title :
Information Theory Proceedings (ISIT), 2011 IEEE International Symposium on
Conference_Location :
St. Petersburg
Print_ISBN :
978-1-4577-0596-0
ISSN :
2157-8095
DOI :
10.1109/ISIT.2011.6033752