DocumentCode :
1779949
Title :
The capacity of string-duplication systems
Author :
Farnoud, Farzad ; Schwartz, M. ; Bruck, Jehoshua
Author_Institution :
Electr. Eng., California Inst. of Technol., Pasadena, CA, USA
fYear :
2014
fDate :
June 29 2014-July 4 2014
Firstpage :
1301
Lastpage :
1305
Abstract :
It is known that the majority of the human genome consists of repeated sequences. Furthermore, it is believed that a significant part of the rest of the genome also originated from repeated sequences and has mutated to its current form. In this paper, we investigate the possibility of constructing an exponentially large number of sequences from a short initial sequence and simple duplication rules, including those resembling genomic duplication processes. In other words, our goal is to find out the capacity, or the expressive power, of these string-duplication systems. Our results include the exact capacities, and bounds on the capacities, of four fundamental string-duplication systems.
Keywords :
biocomputing; genomics; image sequences; string matching; duplication rules; genomic duplication processes; human genome; repeated sequences; string-duplication systems; Automata; Bioinformatics; DNA; Evolution (biology); Formal languages; Genomics; Information theory
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Information Theory (ISIT), 2014 IEEE International Symposium on
Conference_Location :
Honolulu, HI
Type :
conf
DOI :
10.1109/ISIT.2014.6875043
Filename :
6875043