• DocumentCode
    2503412
  • Title

    Bacteria DNA sequence compression using a mixture of finite-context models

  • Author

    Pinho, Armando J. ; Pratas, Diogo ; Ferreira, Paulo J S G

  • Author_Institution
    Signal Process. Lab., Univ. of Aveiro, Aveiro, Portugal
  • fYear
    2011
  • fDate
    28-30 June 2011
  • Firstpage
    125
  • Lastpage
    128
  • Abstract
    The ability of finite-context models to compress DNA sequences has been demonstrated in some recent works. In this paper, we further explore this line of work, proposing a compression method based on eight finite-context models, with orders from two to sixteen, whose probabilities are averaged using weights calculated through a recursive procedure. The method was tested on a total of 2,338 sequences belonging to bacterial genomes, with sizes ranging from 1,286 to 13,033,779 bases, and showed better compression results than the state-of-the-art XM DNA coding algorithm, as well as faster operation.
  • Keywords
    DNA; genomics; microorganisms; XM DNA coding algorithm; bacteria DNA sequence compression; bacterial genome; finite-context model; recursive procedure; Adaptation models; Computational modeling; Context; DNA; Data compression; Encoding; Microorganisms; DNA sequences; data compression; finite-context models;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Statistical Signal Processing Workshop (SSP), 2011 IEEE
  • Conference_Location
    Nice
  • ISSN
    pending
  • Print_ISBN
    978-1-4577-0569-4
  • Type

    conf

  • DOI
    10.1109/SSP.2011.5967637
  • Filename
    5967637
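
The abstract describes averaging the probabilities of several finite-context models with recursively computed weights. The sketch below illustrates that general idea only; it is not taken from the paper. The specific orders (2, 4, ..., 16), the additive smoothing constant `alpha`, the forgetting factor `gamma`, the weight update rule, and all function and class names are assumptions chosen for illustration. It estimates the ideal code length in bits rather than producing an actual compressed stream.

```python
import math

ALPHABET = "ACGT"


class FiniteContextModel:
    """Order-k model: counts how often each symbol follows each k-symbol context."""

    def __init__(self, order, alpha=1.0):
        self.order = order
        self.alpha = alpha  # additive smoothing constant (assumed value)
        self.counts = {}

    def _context(self, seq, i):
        # The (at most) `order` symbols that precede position i.
        return seq[max(0, i - self.order):i]

    def probs(self, seq, i):
        c = self.counts.get(self._context(seq, i), [0] * len(ALPHABET))
        total = sum(c) + self.alpha * len(ALPHABET)
        return [(n + self.alpha) / total for n in c]

    def update(self, seq, i):
        c = self.counts.setdefault(self._context(seq, i), [0] * len(ALPHABET))
        c[ALPHABET.index(seq[i])] += 1


def estimate_bits(seq, orders=(2, 4, 6, 8, 10, 12, 14, 16), gamma=0.9):
    """Ideal code length (bits) of `seq` under a weighted mixture of models."""
    models = [FiniteContextModel(k) for k in orders]
    weights = [1.0 / len(models)] * len(models)
    total_bits = 0.0
    for i, symbol in enumerate(seq):
        s = ALPHABET.index(symbol)
        per_model = [m.probs(seq, i) for m in models]
        # Mixture probability of the symbol that actually occurred.
        mixed = sum(w * p[s] for w, p in zip(weights, per_model))
        total_bits -= math.log2(mixed)
        # Assumed recursive update: damp the old weight, reward models
        # that assigned high probability to the symbol, then renormalise.
        raw = [(w ** gamma) * p[s] for w, p in zip(weights, per_model)]
        norm = sum(raw)
        weights = [r / norm for r in raw]
        for m in models:
            m.update(seq, i)
    return total_bits
```

As a rough usage check, `estimate_bits("ACGT" * 50)` should come out below the 400 bits that a fixed 2-bits-per-base encoding would need, because the models quickly learn the repeating pattern.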