Title :
A novel DNA sequence compression scheme using both intra and inter sequences correlation
Author :
K. O. Cheng;N. F. Law;W. C. Siu
Author_Institution :
Centre for Signal Processing, Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hong Kong
Abstract :
Classical DNA sequence compression algorithms consider only intra-sequence similarity, i.e., similar subsequences within a DNA sequence are found and encoded together. In this work, we exploit inter-sequence similarity in addition to intra-sequence similarity: similar subsequences are searched for both within the DNA sequence itself and in other reference sequences. Hence, highly similar sequences from the same population, or partially similar chromosome sequences of the same species, can be compressed together to reduce storage space. Experimental results show that the proposed scheme achieves good compressibility for both partially similar chromosome sequences and highly similar population sequences.
Keywords :
"Biological cells","DNA","Encoding","Sociology","Statistics","Decoding","Compression algorithms"
Conference_Title :
Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2015 Asia-Pacific
DOI :
10.1109/APSIPA.2015.7415512