Title : 
Compression of Quality Factors in Next Generation Sequencing
         
        
            Author : 
Nalbantoglu, O.U. ; Sayood, K.
         
        
            Author_Institution : 
Dept. of Electr. Eng., Univ. of Nebraska, Lincoln, NE, USA
         
            Abstract : 
We propose a compression algorithm for the quality scores contained in FASTQ files, which are generated in large volumes during high-throughput sequencing. The proposed algorithm is a context-dependent arithmetic coder based on observations of the structure of quality scores in FASTQ files. Simulation results indicate that the algorithm significantly outperforms the current state of the art.
         
        
            Keywords : 
Q-factor; arithmetic codes; data compression; FASTQ files; compression algorithm; context dependent arithmetic coder; high throughput sequencing; next generation sequencing; quality factors compression; quality scores; Context; Educational institutions; Electrical engineering; Next generation networking; Sequential analysis; Biological sequence compression; DNA; Quality factor
         
        
            Conference_Title : 
Data Compression Conference (DCC), 2014
         
        
            Conference_Location : 
Snowbird, UT
         
        
            DOI : 
10.1109/DCC.2014.46