Title :
A maximum-likelihood base caller for DNA sequencing
Author :
Brady, David ; Kocic, Marko ; Miller, Arthur W. ; Karger, Barry L.
Author_Institution :
Dept. of Electr. & Comput. Eng., Northeastern Univ., Boston, MA, USA
Abstract :
The procedures used to sequence the human genome involve the electrophoretic separation of mixtures of deoxyribonucleic acid (DNA) fragments tagged with reporting groups, usually fluorescent dyes. Each fluorescent pulse that arrives from an optical detector corresponds to a nucleotide (base) in the DNA sequence, and the subsequent process of base detection is known as base calling. Generating longer and more accurate sequences in the base-calling process will reduce the high cost of DNA sequencing. This paper presents an automated base-calling algorithm, referred to as the maximum-likelihood base caller (MLB), which is based on maximum likelihood equalization for digital communication channels. Based on 125 experimental datasets, MLB averaged up to 40% fewer errors than the widely used ABI base caller from the Applied Biosystems Division of PE Corporation. MLB's accuracy rivaled that of another well-known base caller, Phred, surpassing it on datasets with high background noise.
Keywords :
DNA; biological techniques; biology computing; electrophoresis; maximum likelihood sequence estimation; molecular biophysics; Applied Biosystems Division of PE Corporation; Phred; automated base-calling algorithm; digital communication channels; experimental datasets; fluorescent pulse; maximum likelihood equalization; maximum-likelihood base caller; optical detector; Bioinformatics; Costs; Fluorescence; Genomics; Humans; Maximum likelihood detection; Optical detectors; Optical pulses; Sequences; Algorithms; Base Sequence; Biomedical Engineering; Databases, Factual; Likelihood Functions; Sequence Analysis, DNA
Journal_Title :
Biomedical Engineering, IEEE Transactions on