DocumentCode :
2737877
Title :
Workshop: Flexible read decomposition for improved short read error correction
Author :
Yang, Xiao ; Dorman, Karin S. ; Aluru, Srinivas
Author_Institution :
Dept. of Electr. & Comput. Eng., Iowa State Univ., Ames, IA, USA
fYear :
2011
fDate :
3-5 Feb. 2011
Firstpage :
277
Lastpage :
277
Abstract :
Error correction is often an important first step before analyzing reads from next-generation DNA sequencers. This talk focuses on a flexible read decomposition method developed to improve the accuracy of error correction and make it more computationally efficient. The method decomposes a read into overlapping tiles, each containing two or more kmers that serve as the basis for error correction. While the value of k is chosen so that each kmer occurs with sufficient frequency, the surrounding kmers within the tile provide the context for improving specificity and resolving ambiguity. The method adopts a flexible tile decomposition strategy to progress quickly through regions where errors are sparse and to explore regions of clustered errors thoroughly. Space usage is reduced by avoiding explicit construction and storage of relationships among kmers, relying instead on space-efficient data structures that can compute such information on the fly. Experimental verification on benchmark data sets from the Illumina Genome Analyzer confirms that the method achieves high error correction accuracy, while the improvements in run time and memory usage enable scaling to large data sets.
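The tile idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `kmers`, `tiles`, and `kmer_counts`, the choice of exactly two kmers per tile, and the `overlap` and `step` parameters are all assumptions made for illustration, since the abstract does not give these details.

```python
from collections import Counter


def kmers(read, k):
    """Return every length-k substring of the read, left to right."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def tiles(read, k, overlap=0, step=1):
    """Decompose a read into overlapping tiles of two kmers each.

    Assumption for this sketch: a tile is two kmers that share
    `overlap` bases, so each tile covers 2 * k - overlap bases.
    `step` is the distance between consecutive tile start positions.
    """
    if not 0 <= overlap < k:
        raise ValueError("overlap must satisfy 0 <= overlap < k")
    if step < 1:
        raise ValueError("step must be at least 1")
    tile_len = 2 * k - overlap
    result = []
    for start in range(0, len(read) - tile_len + 1, step):
        first = read[start:start + k]
        second = read[start + k - overlap:start + tile_len]
        result.append((start, first, second))
    return result


def kmer_counts(reads, k):
    """Count kmer occurrences across all reads (hypothetical helper)."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts
```

In this sketch, a larger `step` loosely corresponds to moving quickly through a region with few errors, and a smaller `step` to examining a region more densely. How the actual method chooses between the two is not specified in the abstract.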
Keywords :
DNA; biology computing; data structures; error correction; molecular biophysics; molecular configurations; Illumina Genome Analyzer; flexible read decomposition; flexible tile decomposition; improved short read error correction; kmers; memory usage; next-generation DNA sequencers; run time; space-efficient data structures; Accuracy; Bioinformatics; Conferences; Error correction; Genomics; Tiles;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Advances in Bio and Medical Sciences (ICCABS), 2011 IEEE 1st International Conference on
Conference_Location :
Orlando, FL
Print_ISBN :
978-1-61284-851-8
Type :
conf
DOI :
10.1109/ICCABS.2011.5729931
Filename :
5729931