Title :
Near optimal multiple sequence alignments using a traveling salesman problem approach
Author :
Korostensky, Chantal ; Gonnet, Gaston
Author_Institution :
Inst. of Sci. Comput., Fed. Inst. of Technol., Zurich, Switzerland
Abstract :
We present a new method for the calculation of multiple sequence alignments (MSAs). The input to our problem are n protein sequences. We assume that the sequences are related with each other and that there exists some unknown evolutionary tree that corresponds to the MSA. One advantage of our method is that the scoring can be done with reference to this phylogenetic tree, even though the tree structure itself may remain unknown. Instead of computing an evolutionary tree, we only need to compute a circular tour of the tree which is determined via a traveling salesman problem (TSP) algorithm. Our algorithm can calculate a near optimal MSA and has a performance guarantee of n-1/n.opt (where opt is the optimal score of the MSA). The algorithm runs in O(k2 n2) time, where k is the length of the longest input sequence. From there, we improve the alignment further. Experimental results are shown at the end
Keywords :
biology computing; computational complexity; travelling salesman problems; tree data structures; TSP algorithm; circular tour; evolutionary tree; longest input sequence; near optimal MSA; near optimal multiple sequence alignments; optimal score; performance guarantee; phylogenetic tree; protein sequences; scoring; traveling salesman problem approach; tree structure; Amino acids; Biology computing; Computational biology; DNA; Optimized production technology; Proteins; Scientific computing; Sequences; Traveling salesman problems; Tree data structures;
Conference_Titel :
String Processing and Information Retrieval Symposium, 1999 and International Workshop on Groupware
Conference_Location :
Cancun
Print_ISBN :
0-7695-0268-7
DOI :
10.1109/SPIRE.1999.796584