• DocumentCode
    3678377
  • Title
    Balancing Thread-Level and Task-Level Parallelism for Data-Intensive Workloads on Clusters and Clouds
  • Author
    Olivia Choudhury;Dinesh Rajan;Nicholas Hazekamp;Sandra Gesing;Douglas Thain;Scott Emrich
  • Author_Institution
    Dept. of Comput. Sci. &
  • fYear
    2015
  • Firstpage
    390
  • Lastpage
    393
  • Abstract
    The runtime configuration of parallel and distributed applications remains a mysterious art. To tune an application on a particular system, the end user must choose the number of machines, the number of cores per task, the data partitioning strategy, and so on, all of which result in a combinatorial explosion of choices. While one might try to exhaustively evaluate all choices in search of the optimal, the end user's goal is simply to run the application once with reasonable performance by avoiding terrible configurations. To address this problem, we present a hybrid technique based on regression models for tuning data-intensive bioinformatics applications: the sequential computational kernel is characterized empirically and then incorporated into an ab initio model of the distributed system. We demonstrate this technique on the commonly used applications BWA, Bowtie2, and BLASR and validate the accuracy of our proposed models on clouds and clusters.
  • Keywords
    "Computational modeling","Data models","Bioinformatics","Instruction sets","Predictive models","Parallel processing","Genomics"
  • Publisher
    ieee
  • Conference_Titel
    Cluster Computing (CLUSTER), 2015 IEEE International Conference on
  • Type
    conf
  • DOI
    10.1109/CLUSTER.2015.60
  • Filename
    7307607