Title :
A compression algorithm for DNA sequences
Author :
Chen, Xin ; Kwong, Sam ; Li, Ming
Author_Institution :
Sch. of Math. Sci., Beijing Univ., China
Abstract :
We present GenCompress, a DNA compression algorithm based on approximate matching that gives the best compression results on standard benchmark DNA sequences. We explain the design rationale of GenCompress, discuss details of the algorithm, provide experimental results, and compare them with those of the two most effective compression algorithms for DNA sequences (Biocompress-2 and Cfact).
Keywords :
DNA; biology computing; data compression; genetics; string matching; DNA sequences; GenCompress; approximate matching; benchmark sequences; bioinformatics; compression algorithm; edit operations; entropy; optimal prefix search; Arithmetic; Biological information theory; Compression algorithms; Computer science; Genetic mutations; Genomics; Sequences; Algorithms; Computational Biology; Databases, Nucleic Acid; Humans; Sequence Analysis, DNA
Journal_Title :
Engineering in Medicine and Biology Magazine, IEEE