• DocumentCode
    3199233
  • Title
    Using Hidden Markov Modeling in DNA Sequencing
  • Author
    Nelson, Ruben; Foo, Simon; Weatherspoon, Mark
  • Author_Institution
    Florida State Univ., Tallahassee
  • fYear
    2008
  • fDate
    16-18 March 2008
  • Firstpage
    215
  • Lastpage
    217
  • Abstract
    Hidden Markov models (HMMs) have demonstrated their usefulness in statistics and pattern recognition, particularly for speech recognition and handwriting recognition. The same principles of statistics and probability can be applied in genetics. DNA primarily has four bases: adenine, guanine, thymine, and cytosine, which, when paired together, can form nucleotides. However, the length of a nucleotide chain can be uncertain. The DNA sequence constitutes the heritable genetic information in nuclei that forms the basis for the developmental programs of all living organisms. Determining the DNA sequence is therefore useful in studying fundamental biological processes, as well as in diagnostic or forensic research. In this study, we use HMMs to determine DNA sequence likelihoods. The first 1000 bases of rice chromosomes serve as the training sequence, and the resulting transition and emission probabilities are used to determine a probable DNA sequence for the next 2000 bases. This predicted sequence should be comparable to the actual sequence. However, experimentation did not show this to be the case, despite previous experiments showing otherwise. Only a fourth of a nucleotide sequence was ever classified correctly.
  • Keywords
    biocomputing; handwriting recognition; hidden Markov models; speech recognition; statistics; DNA sequencing; adenine; cytosine; guanine; hidden Markov modeling; nucleotide chain; pattern recognition; probability; thymine; DNA; Genetics; Organisms; Sequences; Writing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    System Theory, 2008. SSST 2008. 40th Southeastern Symposium on
  • Conference_Location
    New Orleans, LA
  • ISSN
    0094-2898
  • Print_ISBN
    978-1-4244-1806-0
  • Electronic_ISBN
    0094-2898
  • Type
    conf
  • DOI
    10.1109/SSST.2008.4480223
  • Filename
    4480223