Title :
Exact pairwise alignment of megabase genome biological sequences using a novel z-align parallel strategy
Author :
Boukerche, Azzedine ; Batista, Rodolfo Bezerra ; De Melo, Alba Cristina Magalhaes Alves
Author_Institution :
Sch. of Inf. Technol. & Eng. (SITE), Univ. of Ottawa, Ottawa, ON, Canada
Abstract :
Pairwise sequence alignment is a basic operation in bioinformatics that is performed thousands of times on a daily basis. The exact methods proposed in the literature have quadratic time complexity. For this reason, heuristic methods such as BLAST are widely used. Nevertheless, exact methods are known to offer better sensitivity, leading to better results. To obtain exact results faster, many parallel strategies have been proposed, but most of them fail to align huge biological sequences. This happens because not only must the quadratic time be addressed, but the memory space must also be reduced. In this paper, we evaluate the performance and sensitivity of z-align, a parallel exact strategy that runs in user-restricted memory space. The results obtained on a 64-processor cluster show that two sequences of size 23 MBP (Mega Base Pairs) and 24 MBP, respectively, were successfully aligned with z-align. Also, a speedup of 34.35 was achieved when aligning two 3 MBP sequences. Finally, when comparing z-align with BLAST, we can see that the z-align alignments are longer and have a higher score.
Keywords :
bioinformatics; computational complexity; genomics; parallel algorithms; BLAST; heuristic method; pairwise megabase genome biological sequence alignment; quadratic time complexity; user-restricted memory space; z-align parallel strategy; Biological system modeling; Biology; Databases; Dynamic programming; Genomics; Information technology; Organisms; Pattern matching; Sequences
Conference_Title :
Parallel & Distributed Processing, 2009. IPDPS 2009. IEEE International Symposium on
Conference_Location :
Rome
Print_ISBN :
978-1-4244-3751-1
Electronic_ISBN :
1530-2075
DOI :
10.1109/IPDPS.2009.5161113
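
Note :
The abstract's point about quadratic time and reduced memory can be illustrated with a minimal sketch. The code below is not z-align and is not taken from the paper; it is a plain sequential Smith-Waterman score computation that keeps only two rows of the dynamic programming matrix, so time stays quadratic while memory is linear in the length of the second sequence. The function name `sw_score_linear_space` and the scoring values (match = 1, mismatch = -1, linear gap = -2) are assumptions chosen for illustration.

```python
def sw_score_linear_space(s, t, match=1, mismatch=-1, gap=-2):
    """Best local alignment score of s and t, plus the cell where it ends.

    Illustrative sketch only (hypothetical name and scoring values):
    time is O(len(s) * len(t)), memory is O(len(t)) because only the
    previous and current rows of the matrix are kept.
    """
    prev = [0] * (len(t) + 1)           # row i-1 of the DP matrix
    best, best_i, best_j = 0, 0, 0      # best score and its 1-based end cell
    for i in range(1, len(s) + 1):
        curr = [0] * (len(t) + 1)       # row i; column 0 stays 0
        for j in range(1, len(t) + 1):
            sub = match if s[i - 1] == t[j - 1] else mismatch
            curr[j] = max(
                0,                      # start a new local alignment
                prev[j - 1] + sub,      # align s[i-1] with t[j-1]
                prev[j] + gap,          # gap in t
                curr[j - 1] + gap,      # gap in s
            )
            if curr[j] > best:
                best, best_i, best_j = curr[j], i, j
        prev = curr                     # discard the older row
    return best, best_i, best_j


if __name__ == "__main__":
    print(sw_score_linear_space("TTACGTTT", "GGACGGG"))
```

This sketch returns only the score and the end position. Recovering the alignment itself within bounded memory requires additional work, which is the kind of problem the abstract says z-align addresses in parallel.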