• DocumentCode
    1966416
  • Title

    A new heuristic for multiple sequence alignment

  • Author

    Agrawal, Ankit ; Khaitan, Siddhartha Kumar

  • Author_Institution
    Dept. of Comput. Sci., Iowa State Univ., Ames, IA
  • fYear
    2008
  • fDate
    18-20 May 2008
  • Firstpage
    215
  • Lastpage
    217
  • Abstract
    Multiple sequence alignment is an important task in bioinformatics which forms the basis of many other tasks like protein structure prediction, protein function prediction and phylogenetic analysis. An optimal multiple sequence alignment for a given set of sequences is difficult to determine by standard dynamic programming algorithms because of impractically high computational complexity. Therefore, heuristics are commonly used to reduce the alignment construction time, even though the resulting alignment may not be optimal. ClustalW is one of the most popular progressive multiple sequence alignment algorithm where the pairwise sequence alignments are done according to a static guide-tree based on the sequences alone. For a multiple sequence alignment of N sequences, each of length O(n), the ClustalW algorithm takes O(N2n2) time. In this paper, a modification to the algorithm is proposed which reduces the time complexity of the algorithm to O(N log2 Nn2). The proposed heuristic also makes the alignment process more dynamic as the order of sequences added to the multiple sequence alignment also depends on the already computed multiple sequence alignments.
  • Keywords
    biology computing; computational complexity; trees (mathematics); ClustalW algorithm; bioinformatics; multiple sequence alignment algorithm; pairwise sequence alignment; static guide-tree; Bioinformatics; Computational complexity; Computer science; Databases; Dynamic programming; Heuristic algorithms; Iterative algorithms; Iterative methods; Phylogeny; Protein engineering;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Electro/Information Technology, 2008. EIT 2008. IEEE International Conference on
  • Conference_Location
    Ames, IA
  • Print_ISBN
    978-1-4244-2029-2
  • Electronic_ISBN
    978-1-4244-2030-8
  • Type

    conf

  • DOI
    10.1109/EIT.2008.4554299
  • Filename
    4554299