DocumentCode :
3151022
Title :
Iterative progressive alignment method (IPAM) for multiple sequence alignment
Author :
Naznin, Farhana ; Sarker, Ruhul ; Essam, Daryl
Author_Institution :
Australian Defence Force Acad., Univ. of New South Wales, Sydney, NSW, Australia
fYear :
2009
fDate :
6-9 July 2009
Firstpage :
536
Lastpage :
541
Abstract :
In order to design life-saving drugs, such as cancer drugs, the design of protein or DNA structures has to be accurate. These structures depend on multiple sequence alignment (MSA). MSA is a combinatorial optimization problem that is used to find the accurate structure of protein and DNA sequences from existing sequences. In this paper, we propose a new iterative progressive alignment method for multiple sequence alignment, which is a close variant of the MUSCLE algorithm. MUSCLE starts with the "kmer" distance table. However, depending on the length of the gene sequences, our algorithm starts with either the "kmer" distance table or the "dynamic programming (DP)" distance table. The other steps of the algorithm are: generating a guide tree using UPGMA, performing multiple sequence alignment, calculating the "kimura" distance from the aligned sequences, and applying new techniques to improve the multiple sequence alignment. We introduce two new techniques in this research: the first generates guide trees from randomly selected sequences, and the second shuffles the sequences inside that tree. The output of the tree is a multiple sequence alignment, which is evaluated by the sum of pairs method (SPM) using the real-valued data from PAM250. To test the performance of our algorithm, we compare it with existing well-known methods (T-Coffee, MUSCLE, MAFFT and ProbCons) using BAliBASE benchmarks and our own NCBI-based datasets. The experimental results show that the proposed method works well in some situations where other methods have difficulty obtaining better solutions.
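Two of the building blocks named in the abstract, a k-mer distance between unaligned sequences and a sum-of-pairs score for a finished alignment, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the k-mer distance formula (one minus the fraction of shared k-mers), and the toy match/mismatch/gap scores are assumptions chosen for illustration; the paper scores residue pairs with PAM250 values, which are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def kmer_distance(a, b, k=3):
    """Illustrative k-mer distance: 1 - (shared k-mers / k-mers in the shorter sequence).

    Assumption: this simple formula stands in for the k-mer distance used by
    MUSCLE-style methods; it is not taken from the paper.
    """
    kmers_a = Counter(a[i:i + k] for i in range(len(a) - k + 1))
    kmers_b = Counter(b[i:i + k] for i in range(len(b) - k + 1))
    shared = sum((kmers_a & kmers_b).values())  # multiset intersection
    denom = min(len(a), len(b)) - k + 1
    return 1.0 - shared / denom if denom > 0 else 1.0


def toy_score(x, y, match=2, mismatch=-1, gap=-2):
    """Hypothetical pair score; a real run would look up PAM250 instead."""
    if x == "-" and y == "-":
        return 0  # two gaps in the same column contribute nothing
    if x == "-" or y == "-":
        return gap
    return match if x == y else mismatch


def sum_of_pairs(alignment, score=toy_score):
    """Sum the pair score over every pair of rows in every alignment column."""
    total = 0
    for column in zip(*alignment):
        for x, y in combinations(column, 2):
            total += score(x, y)
    return total
```

Under these assumptions, a lower `kmer_distance` suggests two sequences should be joined earlier in the guide tree, and a higher `sum_of_pairs` value indicates a better alignment when candidate alignments of the same sequences are compared.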
Keywords :
biocomputing; dynamic programming; iterative methods; optimisation; sequential estimation; trees (mathematics); DNA structure design; MUSCLE algorithm; UPGMA algorithm; combinatorial optimization; dynamic programming distance table; gene sequences length; guide trees; iterative progressive alignment method; kmer distance table; multiple sequence alignment; protein design; sum of pairs method; Amino acids; Australia; DNA; Drugs; Dynamic programming; Iterative algorithms; Iterative methods; Polymers; Proteins; Sequences; Dynamic Programming (DP); Guide-tree; Multiple Sequence Alignment (MSA); Progressive Alignment;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computers & Industrial Engineering, 2009. CIE 2009. International Conference on
Conference_Location :
Troyes
Print_ISBN :
978-1-4244-4135-8
Electronic_ISBN :
978-1-4244-4136-5
Type :
conf
DOI :
10.1109/ICCIE.2009.5223562
Filename :
5223562