Title of article :
Statistical learning formulation of the DNA base-calling problem and its solution in a Bayesian EM framework (Original Research Article)
Author/Authors :
Manuela S. Pereira, Lucio Andrade, Sameh El Difrawy, Barry L. Karger, Elias S. Manolakos
Issue Information :
Journal issue with serial number, year 2000
Pages :
30
From page :
229
To page :
258
Abstract :
A novel formulation of the important DNA sequence base-calling problem as well as algorithms for its solution are introduced. The proposed approach is to bring DNA base-calling within the framework of a powerful statistical learning paradigm, which allows the incorporation of prior knowledge about the structure of the problem directly into the base-calling algorithms, without resorting to heuristics. Use of prior knowledge provides constraints which help disambiguate the different possible interpretations that the data may have at regions of low SNR, and is shown to lead to a substantial increase of the number of DNA bases that can be accurately called in such regions. Our experimental results suggest that the proposed algorithms, without being optimized, can achieve base-calling performance that matches, and often exceeds, that of commercially available software. Furthermore, due to their statistical basis, they also provide confidence estimates (in the form of posterior probabilities) for the produced base call decisions, which can be used for sequence assembly and mutation detection purposes.
Keywords :
DNA base-calling , Statistical learning , Expectation-maximization algorithm
Journal title :
Discrete Applied Mathematics
Serial Year :
2000
Record number :
885119