Title :
Balancing Thread-Level and Task-Level Parallelism for Data-Intensive Workloads on Clusters and Clouds
Author :
Olivia Choudhury;Dinesh Rajan;Nicholas Hazekamp;Sandra Gesing;Douglas Thain;Scott Emrich
Author_Institution :
Dept. of Comput. Sci. &
Abstract :
The runtime configuration of parallel and distributed applications remains a mysterious art. To tune an application on a particular system, the end user must choose the number of machines, the number of cores per task, the data partitioning strategy, and so on, all of which result in a combinatorial explosion of choices. While one might try to exhaustively evaluate all choices in search of the optimum, the end user's goal is simply to run the application once with reasonable performance by avoiding terrible configurations. To address this problem, we present a hybrid technique based on regression models for tuning data-intensive bioinformatics applications: the sequential computational kernel is characterized empirically and then incorporated into an ab initio model of the distributed system. We demonstrate this technique on the commonly used applications BWA, Bowtie2, and BLASR and validate the accuracy of our proposed models on clouds and clusters.
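The hybrid idea in the abstract can be illustrated with a minimal sketch. It is not the authors' model, which the abstract does not spell out. It assumes a linear regression of single-core kernel runtime against input size, an Amdahl-style thread speedup, and a single shared link for data transfer. Every name, constant, and measurement below (`fit_line`, `kernel_time`, `makespan`, `rank_configs`, `parallel_fraction`, `bandwidth_mb_s`, the sample timings) is hypothetical.

```python
from itertools import product
from math import ceil, inf


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x (the empirical part)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def kernel_time(a, b, size_mb, cores, parallel_fraction=0.9):
    """Predicted kernel runtime; the Amdahl-style scaling is an assumption."""
    single_core = a + b * size_mb
    return single_core * ((1 - parallel_fraction) + parallel_fraction / cores)


def makespan(a, b, total_mb, machines, cores_per_machine,
             cores_per_task, n_tasks, bandwidth_mb_s=100.0):
    """Simplified system model: serialized transfer plus waves of tasks."""
    slots = machines * (cores_per_machine // cores_per_task)
    if slots == 0:
        return inf  # a task needs more cores than one machine offers
    waves = ceil(n_tasks / slots)
    transfer = total_mb / bandwidth_mb_s  # assumed single shared link
    chunk_mb = total_mb / n_tasks         # assumed even data partitioning
    return transfer + waves * kernel_time(a, b, chunk_mb, cores_per_task)


def rank_configs(a, b, total_mb, machines, cores_per_machine):
    """Return (predicted_seconds, cores_per_task, n_tasks), fastest first."""
    results = []
    for cores_per_task, n_tasks in product([1, 2, 4, 8], [8, 16, 32, 64]):
        if cores_per_task > cores_per_machine:
            continue
        t = makespan(a, b, total_mb, machines, cores_per_machine,
                     cores_per_task, n_tasks)
        results.append((t, cores_per_task, n_tasks))
    return sorted(results)


# Made-up single-core measurements: input size (MB) vs. runtime (s).
sizes_mb = [100, 200, 400, 800]
runtimes_s = [12.0, 22.0, 42.0, 82.0]
a, b = fit_line(sizes_mb, runtimes_s)

ranked = rank_configs(a, b, total_mb=6400, machines=4, cores_per_machine=8)
print("best :", ranked[0])
print("worst:", ranked[-1])
```

Ranking the whole grid, instead of returning only the minimum, matches the stated goal of steering clear of terrible configurations: the tail of `ranked` shows which choices to avoid even when the predicted optimum is uncertain.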
Keywords :
"Computational modeling","Data models","Bioinformatics","Instruction sets","Predictive models","Parallel processing","Genomics"
Conference_Title :
Cluster Computing (CLUSTER), 2015 IEEE International Conference on
DOI :
10.1109/CLUSTER.2015.60