• DocumentCode
    653976
  • Title
    Balanced Task Clustering in Scientific Workflows
  • Author
    Chen, Weiwei ; Da Silva, Rafael Ferreira ; Deelman, Ewa ; Sakellariou, Rizos
  • Author_Institution
    Inf. Sci. Inst., Univ. of Southern California, Marina del Rey, CA, USA
  • fYear
    2013
  • fDate
    22-25 Oct. 2013
  • Firstpage
    188
  • Lastpage
    195
  • Abstract
    Scientific workflows can be composed of many tasks of fine computational granularity. The runtime of these tasks may be shorter than the duration of system overheads, for example, when using multiple resources of a cloud infrastructure. Task clustering is a runtime optimization technique that merges multiple short tasks into a single job so that the scheduling overhead is reduced and the overall runtime performance is improved. However, existing task clustering strategies provide only a coarse-grained approach that relies on an over-simplified workflow model. In this work, we examine the causes of Runtime Imbalance and Dependency Imbalance in task clustering. Next, we propose quantitative metrics to evaluate the severity of each of the two imbalance problems. Furthermore, we propose a series of task balancing methods to address these imbalance problems. Finally, we analyze the relationship between these metrics and the performance of the task balancing methods. A trace-based simulation shows that our methods can significantly improve the runtime performance of two widely used workflows compared to the actual implementation of task clustering.
  • Keywords
    natural sciences computing; pattern clustering; balanced task clustering; dependency imbalance; runtime imbalance; scientific workflows; task balancing methods; task clustering; trace-based simulation; Clustering algorithms; Delays; Educational institutions; Engines; Optimization; Runtime; data locality; load balance
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2013 IEEE 9th International Conference on eScience (eScience)
  • Conference_Location
    Beijing
  • Type
    conf
  • DOI
    10.1109/eScience.2013.40
  • Filename
    6683907