• DocumentCode
    1913402
  • Title

    Acceleration of Data-Intensive Workflow Applications by Using File Access History

  • Author

    Horiuchi, Masaru ; Taura, Koichi

  • Author_Institution
    Univ. of Tokyo, Tokyo, Japan
  • fYear
    2012
  • fDate
    10-16 Nov. 2012
  • Firstpage
    157
  • Lastpage
    165
  • Abstract
    Data I/O has been one of the major bottlenecks in the execution of data-intensive workflow applications. Appropriate task scheduling of a workflow can achieve high I/O throughput by reducing remote data accesses. However, most such task scheduling algorithms require the user to explicitly describe the files accessed by each job, typically through stage-in/stage-out directives in the job description; such annotations are at best tedious and sometimes impossible. Thus, a more automated mechanism is necessary. In this paper, we propose a method for predicting the input/output files of each job without user-supplied annotations. It predicts I/O files by collecting file access history in a profiling run prior to the production run. We implemented the proposed method in the workflow system GXP Make and the distributed file system Mogami. We evaluate our system with two real workflow applications. Our data-aware job scheduler increases the ratio of local file accesses from 50% to 75% in one application and from 23% to 45% in the other. As a result, it reduces the makespan of the two applications by 2.5% and 7.5%, respectively.
  • Keywords
    distributed processing; file organisation; scheduling; workflow management software; GXP Make workflow system; Mogami distributed file system; data input-output; data-aware job scheduler; data-intensive workflow application; file access history; input-output throughput; local file access ratio; stage-in-stage-out directive; user-supplied annotation; workflow task scheduling
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing, Networking, Storage and Analysis (SCC), 2012 SC Companion:
  • Conference_Location
    Salt Lake City, UT
  • Print_ISBN
    978-1-4673-6218-4
  • Type

    conf

  • DOI
    10.1109/SC.Companion.2012.31
  • Filename
    6495813