DocumentCode :
571562
Title :
Probability Model for Boundaries of Short-Read Sequencing
Author :
Schatz, Florian ; Wienbrandt, Lars ; Schimmler, Manfred
Author_Institution :
Dept. of Comput. Sci., Christian-Albrechts-Univ. zu Kiel, Kiel, Germany
fYear :
2012
fDate :
9-11 Aug. 2012
Firstpage :
223
Lastpage :
228
Abstract :
The need for sequencing DNA has grown tremendously over the past few years. Current next-generation sequencing techniques produce huge amounts of data, but time and money remain limiting factors for researchers. Given a DNA sample, it is essential to produce enough reads to create or recreate a digital representation of the DNA while minimizing the resources needed. This work proposes a theoretical model that yields a set of formulas for calculating, among other quantities, the expected distribution of contig lengths and the estimated N50 value for a low-coverage, short-read sequencing experiment. The formulas can be used as an extension of the well-known Lander-Waterman model for modeling assembly projects. Their only input parameters are the DNA sequence length, the number of reads, and the read length. The formulas provide bounds (e.g. on N50) that can be calculated before sequencing begins, in order to reduce or adjust the resources needed for resequencing or de novo assembly, to obtain enough but not too much information, or to estimate the feasibility of a sequencing project.
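As a rough illustration of the kind of pre-sequencing estimate the abstract describes, the sketch below computes the classical Lander-Waterman quantities from the same three inputs (sequence length, number of reads, read length) and an N50 value from a list of contig lengths. It is a minimal sketch, not the formulas proposed in this paper: it assumes uniformly placed reads and no minimum overlap between reads, and all function names are hypothetical.

```python
import math


def coverage(genome_len, num_reads, read_len):
    """Average coverage c = N * L / G."""
    return num_reads * read_len / genome_len


def uncovered_fraction(genome_len, num_reads, read_len):
    """Classical Lander-Waterman approximation: a base is missed with probability e^(-c)."""
    return math.exp(-coverage(genome_len, num_reads, read_len))


def expected_contigs(genome_len, num_reads, read_len):
    """Classical approximation N * e^(-c), assuming no minimum overlap is required."""
    return num_reads * math.exp(-coverage(genome_len, num_reads, read_len))


def n50(contig_lengths):
    """Length of the contig at which the running total, longest first, reaches half the assembly."""
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Hypothetical experiment: 1 Mbp sequence, 20,000 reads of 100 bp each.
    g, n, l = 1_000_000, 20_000, 100
    print("coverage:", coverage(g, n, l))
    print("uncovered fraction:", uncovered_fraction(g, n, l))
    print("expected contigs:", expected_contigs(g, n, l))
    print("N50 of example contigs:", n50([10, 20, 30, 40]))
```

At low coverage the expected number of contigs stays large and the uncovered fraction is far from zero, which is the regime the abstract targets.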
Keywords :
DNA; molecular biophysics; probability; sampling methods; DNA sampling; DNA sequence length; Lander-Waterman model; contig length; de novo assembly; digital representation; model assembly projects; next-generation sequencing techniques; probability model; short-read sequence; short-read sequencing experiment; Assembly; Bioinformatics; DNA; Data models; Estimation; Genomics; Humans; Alignment; Coverage; Multiple sequence alignment; Next-generation sequencing; Sequencing; Statistical genetics;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Advances in Computing and Communications (ICACC), 2012 International Conference on
Conference_Location :
Cochin, Kerala
Print_ISBN :
978-1-4673-1911-9
Type :
conf
DOI :
10.1109/ICACC.2012.51
Filename :
6305594