Title :
A probabilistic approach for long read-length DNA sequence analysis
Author :
Molina, Christophe G. ; Mullikin, Jim
Author_Institution :
Sanger Centre, Wellcome Trust Genome Campus, Cambridge, UK
Abstract :
This paper introduces a new algorithm for DNA sequence analysis, based on the use of a reference DNA sequence for the estimation of base positions, and a probabilistic modelling of trace peaks. The new algorithm has been applied to long read-length DNA sequences and its performance has been compared to the base-calling program Phred. The results reported in this paper, after cross-matching with a finished consensus, show a significant improvement by the new algorithm in the final sequence read-length and in the number of correct bases extracted from DNA traces.
Keywords :
DNA; molecular biophysics; probability; DNA traces; algorithm; base-calling program Phred; correct bases extracted number; cross-matching; final sequence read-length; finished consensus; long read-length DNA sequence analysis; probabilistic approach; trace peaks; Algorithm design and analysis; Bioinformatics; DNA; Genomics; Humans; Image sequence analysis; Libraries; Phase estimation; Signal analysis; Signal processing algorithms;
Conference_Title :
Neural Networks for Signal Processing, 2002. Proceedings of the 2002 12th IEEE Workshop on
Print_ISBN :
0-7803-7616-1
DOI :
10.1109/NNSP.2002.1030016