  • DocumentCode
    451282
  • Title
    Scalable NIC-based Reduction on Large-scale Clusters
  • Author
    Moody, Adam; Fernandez, Juan; Petrini, Fabrizio; Panda, Dhabaleswar K.
  • Author_Institution
    The Ohio State University, Columbus
  • fYear
    2003
  • fDate
    15-21 Nov. 2003
  • Firstpage
    59
  • Lastpage
    59
  • Abstract
    Many parallel algorithms require efficient reduction collectives. In response, researchers have designed algorithms considering a range of parameters including data size, system size, and communication characteristics. Throughout this past work, however, processing was limited to the host CPU. Today, modern Network Interface Cards (NICs) sport programmable processors with substantial memory, and thus introduce a fresh variable into the equation. In this paper, we investigate this new option in the context of large-scale clusters. Through experiments on the 960-node, 1920-processor ASCI Linux Cluster (ALC) at Lawrence Livermore National Laboratory, we show that NIC-based reductions outperform host-based algorithms in terms of reduced latency and increased consistency. In particular, in the largest configuration tested - 1812 processors - our NIC-based algorithm summed single-element vectors of 32-bit integers and 64-bit floating-point numbers in 73 µs and 118 µs, respectively. These results represent respective improvements of 121% and 39% over the production-level MPI library.
  • Keywords
    Algorithm design and analysis; Automatic logic units; Clustering algorithms; Context; Equations; Laboratories; Large-scale systems; Linux; Network interfaces; Parallel algorithms
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Supercomputing, 2003 ACM/IEEE Conference
  • Print_ISBN
    1-58113-695-1
  • Type
    conf
  • DOI
    10.1109/SC.2003.10051
  • Filename
    1592962