DocumentCode
636252
Title
In-depth analysis of interrelation between quality scores and real errors in Illumina reads
Author
Sunyoung Kwon ; Seunghyun Park ; Byunghan Lee ; Sungroh Yoon
Author_Institution
Interdiscipl. Program in Bioinf., Seoul Nat. Univ., Seoul, South Korea
fYear
2013
fDate
3-7 July 2013
Firstpage
635
Lastpage
638
Abstract
In sequencing results, a quality score is reported for each base, representing the probability that the base was called incorrectly. The notion of quality scores was initially developed for conventional Sanger sequencing, but is widely used for next-generation sequencing techniques, including Illumina. In this paper, we carry out an in-depth analysis of the quality scores reported for Illumina reads and show how they relate to real errors in the reads. We confirmed a strong interrelation between quality scores and real errors in Illumina reads, and observed that, in paired-end reads, reverse reads tend to have lower quality scores than forward reads. In addition, we discovered other interesting patterns from the quality score analysis. We hope that the findings in this paper will be helpful for designing error-correction and/or filtering methods for next-generation sequencing.
Keywords
genetics; genomics; molecular configurations; Illumina read errors; Illumina read quality scores; conventional Sanger sequencing; incorrectly called base; next generation sequencing techniques; quality score analysis; sequencing results; Bioinformatics; Educational institutions; Error analysis; Filtering; Genomics; Next generation networking; Sequential analysis
fLanguage
English
Publisher
IEEE
Conference_Titel
Engineering in Medicine and Biology Society (EMBC), 2013 35th Annual International Conference of the IEEE
Conference_Location
Osaka
ISSN
1557-170X
Type
conf
DOI
10.1109/EMBC.2013.6609580
Filename
6609580