• DocumentCode
    950139
  • Title

    Algorithms for Integrated Routing and Scheduling for Aggregating Data from Distributed Resources on a Lambda Grid

  • Author

    Banerjee, Amitabha ; Feng, Wu-chun ; Ghosal, Dipak ; Mukherjee, Biswanath

  • Author_Institution
    Univ. of California, Davis
  • Volume
    19
  • Issue
    1
  • fYear
    2008
  • Firstpage
    24
  • Lastpage
    34
  • Abstract
    In many e-science applications, there exists an important need to aggregate information from data repositories distributed around the world. In an effort to better link these resources in a unified manner, many lambda-grid networks, which provide end-to-end dedicated optical-circuit-switched connections, have been investigated. In this context, we consider the problem of aggregating files from distributed databases at a (grid) computing node over a lambda grid. The challenge is (1) to identify routes (that is, circuits) in the lambda-grid network, along which files should be transmitted, and (2) to schedule the transfers of these files over their respective circuits. To address this challenge, we propose a hybrid approach that combines offline and online scheduling. We define the Time-Path Scheduling Problem (TPSP) for offline scheduling. We prove that TPSP is NP-complete, develop a Mixed Integer Linear Program (MILP) formulation for TPSP, and then propose a greedy approach to solve TPSP because the MILP does not scale well. We compare the performance of the greedy approach on a few representative lambda-grid network topologies. One key input to the offline schedule is the file transfer time. Due to dynamics at the receiving end host, which are hard to model precisely, the actual file transfer time may vary. We first propose a model for estimating the file transfer time. Then, we propose online reconfiguration algorithms so that as files are transferred, the offline schedule may be modified online, depending on the amount of time that it actually took to transfer each file. This helps in reducing the total time to transfer all the files, which is an important metric. To demonstrate the effectiveness of our approach, we present results on an emulated lambda-grid network testbed.
  • Keywords
    distributed databases; greedy algorithms; grid computing; integer programming; linear programming; natural sciences computing; network routing; scheduling; NP-complete problem; data aggregation; distributed database; distributed resource; e-science application; end-to-end dedicated optical-circuit-switched connection; file transfer time estimation; greedy approach; integrated routing; lambda-grid network topology; mixed integer linear program; time-path scheduling problem; circuit switching; lambda grid; large scale data transfers; routing
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9219
  • Type

    jour

  • DOI
    10.1109/TPDS.2007.1112
  • Filename
    4359400