DocumentCode :
1995495
Title :
A Generic Vectorization Scheme and a GPU Kernel for the Phylogenetic Likelihood Library
Author :
Izquierdo-Carrasco, Fernando ; Alachiotis, Nikolaos ; Berger, Stephen ; Flouri, Tomas ; Pissis, Solon P. ; Stamatakis, Alexandros
Author_Institution :
Exelixis Lab. Sci. Comput. Group, Heidelberg Inst. for Theor. Studies, Heidelberg, Germany
fYear :
2013
fDate :
20-24 May 2013
Firstpage :
530
Lastpage :
538
Abstract :
Highly optimized library implementations for important scientific kernels can improve scientific productivity. To this end, we are currently developing the Phylogenetic Likelihood Library (PLL), which implements functions to compute and optimize the phylogenetic likelihood score on evolutionary trees. Here, we focus on novel techniques to orchestrate likelihood computations on large vector-like processors such as GPUs. We present a novel scheme for vectorizing computations and organizing conditional likelihood arrays (CLAs) in such a way that they do not need to be transferred at all between the GPU and the CPU. We compare the performance of our GPU implementation for DNA data with a highly optimized x86 version of the PLL that relies on manually tuned AVX intrinsics. Our GPU implementation accelerates the likelihood computations by a factor of two compared to what is, most probably, the fastest currently available x86 implementation. We conclude that a hybrid GPU-CPU version needs to be developed and integrated into the PLL to leverage the computational power of modern desktop systems and clusters.
Keywords :
bioinformatics; evolution (biological); evolutionary computation; genetics; graphics processing units; instruction sets; maximum likelihood estimation; multiprocessing systems; software libraries; vector processor systems; AVX; CLA; GPU kernel; PLL; cluster; conditional likelihood array; evolutionary tree; hybrid CPU; modern desktop system; orchestrate likelihood computation; phylogenetic likelihood library; phylogenetic likelihood score; vector like processor; vectorization scheme; x86 implementation; DNA; Graphics processing units; Layout; Libraries; Optimization; Phase locked loops; Vectors; GPU; OpenCL; maximum likelihood; phylogenetics; vector intrinsics;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International
Conference_Location :
Cambridge, MA
Print_ISBN :
978-0-7695-4979-8
Type :
conf
DOI :
10.1109/IPDPSW.2013.103
Filename :
6650928