• DocumentCode
    464286
  • Title
    Overestimation for Multiple Sequence Alignment
  • Author
    Cazenave, Tristan
  • Author_Institution
    Dept. Informatique, Univ. Paris 8
  • fYear
    2007
  • fDate
    1-5 April 2007
  • Firstpage
    159
  • Lastpage
    164
  • Abstract
    Multiple sequence alignment is an important problem in computational biology. A-star is an algorithm that can be used to find exact alignments. We present a simple modification of the A-star algorithm that greatly improves multiple sequence alignment, in both time and memory, at the cost of a small loss of accuracy. It consists of overestimating the admissible heuristic. For random sequences of length 250, a typical speedup is 47, with a memory gain of 13 and an error rate of 0.09%. For real sequences, the speedup can be greater than 20,000 and the memory gain greater than 150, with error rates ranging from 0.08% to 0.67% for the sequences we tested. Overestimation can align sequences that cannot be aligned with the exact algorithm.
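    The idea of overestimating an admissible heuristic can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes only two sequences, unit gap and mismatch costs, and a simple length-difference lower bound as the heuristic. The names align_cost and weight are hypothetical. With weight = 1 the search is ordinary A-star and returns the exact cost; with weight > 1 the heuristic is overestimated, so the returned cost may exceed the optimum (by at most a factor of weight), typically in exchange for fewer expanded nodes.

```python
import heapq


def align_cost(a, b, weight=1.0, gap=1, mismatch=1):
    """Align a and b with A* using f = g + weight * h.

    Returns (alignment_cost, expanded_nodes). All names and cost values
    here are illustrative assumptions, not taken from the paper.
    """
    goal = (len(a), len(b))

    def h(i, j):
        # Admissible lower bound: the remaining suffixes differ in length
        # by this many characters, each of which needs at least one gap.
        return gap * abs((len(a) - i) - (len(b) - j))

    best_g = {(0, 0): 0}
    frontier = [(weight * h(0, 0), 0, (0, 0))]
    expanded = 0
    while frontier:
        _, g, (i, j) = heapq.heappop(frontier)
        if g > best_g.get((i, j), float("inf")):
            continue  # stale queue entry, a cheaper path was found later
        if (i, j) == goal:
            return g, expanded
        expanded += 1
        moves = []
        if i < len(a) and j < len(b):
            # Align a[i] with b[j]: free on a match, else a mismatch cost.
            moves.append((i + 1, j + 1, 0 if a[i] == b[j] else mismatch))
        if i < len(a):
            moves.append((i + 1, j, gap))  # a[i] aligned with a gap
        if j < len(b):
            moves.append((i, j + 1, gap))  # b[j] aligned with a gap
        for ni, nj, step in moves:
            ng = g + step
            if ng < best_g.get((ni, nj), float("inf")):
                best_g[(ni, nj)] = ng
                heapq.heappush(frontier, (ng + weight * h(ni, nj), ng, (ni, nj)))
    raise ValueError("goal not reachable")
```

    Calling align_cost(a, b, weight=1.0) and align_cost(a, b, weight=2.0) on the same pair of sequences and comparing the two returned tuples shows the trade-off between accuracy and search effort on a small scale.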
  • Keywords
    biology computing; sequences; A-star algorithm; computational biology; multiple sequence alignment; overestimation; random sequences; Bioinformatics; Computational intelligence; Costs; DNA; Dynamic programming; Error analysis; Lattices; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Bioinformatics and Computational Biology, 2007. CIBCB '07. IEEE Symposium on
  • Conference_Location
    Honolulu, HI
  • Print_ISBN
    1-4244-0710-9
  • Type
    conf
  • DOI
    10.1109/CIBCB.2007.4221218
  • Filename
    4221218