DocumentCode
1966416
Title
A new heuristic for multiple sequence alignment
Author
Agrawal, Ankit ; Khaitan, Siddhartha Kumar
Author_Institution
Dept. of Comput. Sci., Iowa State Univ., Ames, IA
fYear
2008
fDate
18-20 May 2008
Firstpage
215
Lastpage
217
Abstract
Multiple sequence alignment is an important task in bioinformatics which forms the basis of many other tasks like protein structure prediction, protein function prediction and phylogenetic analysis. An optimal multiple sequence alignment for a given set of sequences is difficult to determine by standard dynamic programming algorithms because of impractically high computational complexity. Therefore, heuristics are commonly used to reduce the alignment construction time, even though the resulting alignment may not be optimal. ClustalW is one of the most popular progressive multiple sequence alignment algorithm where the pairwise sequence alignments are done according to a static guide-tree based on the sequences alone. For a multiple sequence alignment of N sequences, each of length O(n), the ClustalW algorithm takes O(N2n2) time. In this paper, a modification to the algorithm is proposed which reduces the time complexity of the algorithm to O(N log2 Nn2). The proposed heuristic also makes the alignment process more dynamic as the order of sequences added to the multiple sequence alignment also depends on the already computed multiple sequence alignments.
Keywords
biology computing; computational complexity; trees (mathematics); ClustalW algorithm; bioinformatics; multiple sequence alignment algorithm; pairwise sequence alignment; static guide-tree; Bioinformatics; Computational complexity; Computer science; Databases; Dynamic programming; Heuristic algorithms; Iterative algorithms; Iterative methods; Phylogeny; Protein engineering;
fLanguage
English
Publisher
ieee
Conference_Titel
Electro/Information Technology, 2008. EIT 2008. IEEE International Conference on
Conference_Location
Ames, IA
Print_ISBN
978-1-4244-2029-2
Electronic_ISBN
978-1-4244-2030-8
Type
conf
DOI
10.1109/EIT.2008.4554299
Filename
4554299
Link To Document