• DocumentCode
    2223895
  • Title

    A new metric for robustness with application to job scheduling

  • Author

    England, Darin ; Weissman, Jon ; Sadagopan, Jayashree

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Minnesota Univ., Twin Cities, MN, USA
  • fYear
    2005
  • fDate
    24-27 July 2005
  • Firstpage
    135
  • Lastpage
    143
  • Abstract
    Scheduling strategies for parallel and distributed computing have mostly been oriented toward performance, while striving to achieve some notion of fairness. With the increase in size, complexity, and heterogeneity of today's computing environments, we argue that, in addition to performance metrics, scheduling algorithms should be designed for robustness. That is, they should have the ability to maintain performance under a wide variety of operating conditions. Although robustness is easy to define, there are no widely used metrics for this property. To this end, we present a methodology for characterizing and measuring the robustness of a system to a specific disturbance. The methodology is easily applied to many types of computing systems and it does not require sophisticated mathematical models. To illustrate its use, we show three applications of our technique to job scheduling: one supporting a previous result with respect to backfilling, one examining overload control in a streaming video server, and one comparing two different scheduling strategies for a distributed network service. The last example also demonstrates how consideration of robustness leads to better system design, as we were able to devise a new and effective scheduling heuristic.
  • Keywords
    parallel processing; resource allocation; scheduling; video servers; video signal processing; video streaming; backfilling; distributed computing; distributed network service; fairness; job scheduling; overload control; parallel computing; performance metrics; robustness metric; scheduling algorithm; streaming video server; Application software; Cities and towns; Computer science; Concurrent computing; Distributed computing; Mathematical model; Measurement; Processor scheduling; Robustness; Uncertainty;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Distributed Computing, 2005. HPDC-14. Proceedings. 14th IEEE International Symposium on
  • ISSN
    1082-8907
  • Print_ISBN
    0-7803-9037-7
  • Type

    conf

  • DOI
    10.1109/HPDC.2005.1520948
  • Filename
    1520948