DocumentCode :
506028
Title :
Large-scale maximum likelihood-based phylogenetic analysis on the IBM BlueGene/L
Author :
Ott, Michael ; Zola, Jaroslaw ; Stamatakis, Alexandros ; Aluru, Srinivas
Author_Institution :
Technical University of Munich
fYear :
2007
fDate :
10-16 Nov. 2007
Firstpage :
1
Lastpage :
11
Abstract :
Phylogenetic inference is a grand challenge in Bioinformatics due to immense computational requirements. The increasing popularity of multi-gene alignments in biological studies, which typically provide a stable topological signal due to a more favorable ratio of the number of base pairs to the number of sequences, coupled with rapid accumulation of sequence data in general, poses new challenges for high performance computing. In this paper, we demonstrate how state-of-the-art Maximum Likelihood (ML) programs can be efficiently scaled to the IBM BlueGene/L (BG/L) architecture, by porting RAxML, which is currently among the fastest and most accurate programs for phylogenetic inference under the ML criterion. We simultaneously exploit coarse-grained and fine-grained parallelism that is inherent in every ML-based biological analysis. Performance is assessed using datasets consisting of 212 sequences and 566,470 base pairs, and 2,182 sequences and 51,089 base pairs, respectively. To the best of our knowledge, these are the largest datasets analyzed under ML to date. The capability to analyze such datasets will help to address novel biological questions via phylogenetic analyses. Our experimental results indicate that the fine-grained parallelization scales well up to 1,024 processors. Moreover, a larger number of processors can be efficiently exploited by a combination of coarse-grained and fine-grained parallelism. Finally, we demonstrate that our parallelization scales equally well on an AMD Opteron cluster with a less favorable network latency to processor speed ratio. We recorded super-linear speedups in several cases due to increased cache efficiency.
Keywords :
Bridges; Cache memory; Delay; Drain avalanche hot carrier injection; Government; History; Large-scale systems; Performance gain; Phylogeny; Prefetching; IBM BlueGene/L; RAxML; maximum likelihood; phylogenetic inference;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Supercomputing, 2007. SC '07. Proceedings of the 2007 ACM/IEEE Conference on
Conference_Location :
Reno, NV, USA
Print_ISBN :
978-1-59593-764-3
Electronic_ISBN :
978-1-59593-764-3
Type :
conf
DOI :
10.1145/1362622.1362628
Filename :
5348861