Title :
Common words between two random strings
Author_Institution :
INRIA, Rocquencourt
Abstract :
We investigate the problem of enumerating the words that are common to two random strings. We show that when the source models are memoryless that the number of common words is sublinear in the length of the sequence, and linear when the source models are exactly the same. We draw the same conclusions for the number of common nodes in the associated respective suffix trees.
Keywords :
memoryless systems; random codes; source coding; string matching; trees (mathematics); common words; memoryless source model; random strings; suffix trees; Autocorrelation; Cats; Convergence; Dogs; Genetics; Information analysis; Information theory; Random number generation; Random sequences; Statistical analysis;
Conference_Titel :
Information Theory, 2007. ISIT 2007. IEEE International Symposium on
Conference_Location :
Nice
Print_ISBN :
978-1-4244-1397-3
DOI :
10.1109/ISIT.2007.4557431