• DocumentCode
    3206091
  • Title

    Reducing Fragmentation on Torus-Connected Supercomputers

  • Author

    Tang, Wei ; Lan, Zhiling ; Desai, Narayan ; Buettner, Daniel ; Yu, Yongen

  • Author_Institution
    Dept. of Comput. Sci., Illinois Inst. of Technol., Chicago, IL, USA
  • fYear
    2011
  • fDate
    16-20 May 2011
  • Firstpage
    828
  • Lastpage
    839
  • Abstract
    Torus-based networks are prevalent on leadership-class petascale systems, providing a good balance between network cost and performance. The major disadvantage of this network architecture is its susceptibility to fragmentation. Many studies have attempted to reduce resource fragmentation in this architecture. Although the approaches suggested can make good allocation decisions reducing fragmentation at job start time, none of them considers a job's wall time, which can cause resource fragmentation when neighboring jobs do not complete closely. In this paper, we propose a wall time-aware job allocation strategy, which adjacently packs jobs that finish around the same time, in order to minimize resource fragmentation caused by job length discrepancy. Event-driven simulations using real job traces from a production Blue Gene/P system at Argonne National Laboratory demonstrate that our wall time-aware strategy can effectively reduce system fragmentation and improve overall system performance.
  • Keywords
    computer architecture; discrete event simulation; mainframes; Blue Gene/P system; Torus connected supercomputer; event driven simulation; fragmentation reduction; leadership class petascale system; network architecture; resource fragmentation minimization; torus-based networks; wall time aware job allocation strategy; Cobalt; Laboratories; Partitioning algorithms; Resource management; Runtime; Scheduling; Three dimensional displays;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel & Distributed Processing Symposium (IPDPS), 2011 IEEE International
  • Conference_Location
    Anchorage, AK
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-61284-372-8
  • Electronic_ISBN
    1530-2075
  • Type

    conf

  • DOI
    10.1109/IPDPS.2011.82
  • Filename
    6012892