DocumentCode :
1518258
Title :
A compression algorithm for DNA sequences
Author :
Chen, Xin ; Kwong, Sam ; Li, Ming
Author_Institution :
Sch. of Math. Sci., Beijing Univ., China
Volume :
20
Issue :
4
Year :
2001
Firstpage :
61
Lastpage :
66
Abstract :
We present a DNA compression algorithm, GenCompress, based on approximate matching that gives the best compression results on standard benchmark DNA sequences. We present the design rationale of GenCompress based on approximate matching, discuss details of the algorithm, provide experimental results, and compare the results with the two most effective compression algorithms for DNA sequences (Biocompress-2 and Cfact).
Keywords :
DNA; biology computing; data compression; genetics; string matching; DNA sequences; GenCompress; approximate matching; benchmark sequences; bioinformatics; compression algorithm; edit operations; entropy; optimal prefix search; Arithmetic; Bioinformatics; Biological information theory; Compression algorithms; Computer science; DNA; Data compression; Genetic mutations; Genomics; Sequences; Algorithms; Computational Biology; DNA; Databases, Nucleic Acid; Humans; Sequence Analysis, DNA;
Language :
English
Journal_Title :
Engineering in Medicine and Biology Magazine, IEEE
Publisher :
IEEE
ISSN :
0739-5175
Type :
jour
DOI :
10.1109/51.940049
Filename :
940049