Title :
Computing large-scale alignments on a multi-cluster
Author :
Cehn ; Schmidt, Bertil
Author_Institution :
Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore
Abstract :
Molecular biologists frequently align DNA sequences of entire genomes to detect important matched and mismatched regions. Even though efficient dynamic programming algorithms exist for this problem, the required computing time is still very high due to the size of these sequences (usually a few million base pairs in length). Because the number of sequenced organisms is increasing rapidly, fast and accurate solutions are of highest importance to research in this area. In this paper we present an algorithm to compute the optimal and near-optimal alignments of two sequences in linear space and quadratic time. We demonstrate how this algorithm can be parallelized efficiently on a PC cluster and on a computational grid in order to reduce its runtime significantly. The grid implementation uses a hierarchical approach combining inter-cluster and intra-cluster parallelism.
Keywords :
DNA; biology computing; dynamic programming; grid computing; message passing; parallel processing; workstation clusters; DNA sequences; MPI; PC cluster; computational grid; dynamic programming algorithms; genomes; grid computing; intercluster parallelism; intracluster parallelism; large-scale alignments; multicluster computing; sequence alignment; sequenced organisms; Bioinformatics; Biology computing; Biomedical computing; Clustering algorithms; DNA; Dynamic programming; Genomics; Heuristic algorithms; Large-scale systems; Message passing; Organisms; Parallel processing; Sequences;
Conference_Titel :
Cluster Computing, 2003. Proceedings. 2003 IEEE International Conference on
Print_ISBN :
0-7695-2066-9
DOI :
10.1109/CLUSTR.2003.1253297