DocumentCode
3199233
Title
Using Hidden Markov Modeling in DNA Sequencing
Author
Nelson, Ruben ; Foo, Simon ; Weatherspoon, Mark
Author_Institution
Florida State Univ., Tallahassee
fYear
2008
fDate
16-18 March 2008
Firstpage
215
Lastpage
217
Abstract
Hidden Markov models (HMMs) have largely demonstrated their usefulness in the fields of statistics and pattern recognition, particularly for speech recognition and handwriting recognition. In the field of genetics, the same principles of statistics and probability can be applied. DNA primarily has four bases: adenine, guanine, thymine, and cytosine, which, when paired together, can form nucleotides. However, the length of a nucleotide chain can be uncertain. The DNA sequence constitutes the heritable genetic information in nuclei that forms the basis for the developmental programs of all living organisms. Determining the DNA sequence is therefore useful in studying fundamental biological processes, as well as in diagnostic or forensic research. In this study, we use HMMs to determine DNA sequence likelihoods. The first 1000 bases of rice chromosomes are used as a training sequence, and the transition and emission probabilities are used to determine a probable DNA sequence for the next 2000 bases. This sequence should be comparable to the actual sequence. However, experimentation did not show this to be the case, despite previous experiments showing otherwise. Only a fourth of a nucleotide sequence was ever classified correctly.
Keywords
biocomputing; handwriting recognition; hidden Markov models; speech recognition; statistics; DNA sequencing; adenine; cytosine; guanine; hidden Markov modeling; nucleotide chain; pattern recognition; probability; thymine; DNA; Genetics; Organisms; Sequences; Writing; Hidden Markov Model
fLanguage
English
Publisher
IEEE
Conference_Titel
System Theory, 2008. SSST 2008. 40th Southeastern Symposium on
Conference_Location
New Orleans, LA
ISSN
0094-2898
Print_ISBN
978-1-4244-1806-0
Electronic_ISBN
0094-2898
Type
conf
DOI
10.1109/SSST.2008.4480223
Filename
4480223