• DocumentCode
    147107
  • Title
    Compression of Quality Factors in Next Generation Sequencing
  • Author
    Nalbantoglu, O.U. ; Sayood, K.
  • Author_Institution
    Dept. of Electr. Eng., Univ. of Nebraska, Lincoln, NE, USA
  • fYear
    2014
  • fDate
    26-28 March 2014
  • Firstpage
    419
  • Lastpage
    419
  • Abstract
    We propose a compression algorithm for the quality scores contained in FASTQ files, which are generated in large volumes during high-throughput sequencing. The proposed algorithm is a context-dependent arithmetic coder based on observations of the structure of quality scores in FASTQ files. Simulation results indicate that the algorithm performs significantly better than the current state of the art.
  • Keywords
    Q-factor; arithmetic codes; data compression; FASTQ files; compression algorithm; context dependent arithmetic coder; high throughput sequencing; next generation sequencing; quality factors compression; quality scores; Context; Data compression; Educational institutions; Electrical engineering; Next generation networking; Q-factor; Sequential analysis; Biological sequence compression; DNA; Quality factor;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Compression Conference (DCC), 2014
  • Conference_Location
    Snowbird, UT
  • ISSN
    1068-0314
  • Type
    conf
  • DOI
    10.1109/DCC.2014.46
  • Filename
    6824471