• DocumentCode
    3298975
• Title
    MPI-StarT: Delivering Network Performance to Numerical Applications

• Author
    Husbands, Parry; Hoe, James C.

  • Author_Institution
    Massachusetts Institute of Technology
  • fYear
    1998
  • fDate
    07-13 Nov. 1998
  • Firstpage
    17
  • Lastpage
    17
  • Abstract
    We describe an MPI implementation for a cluster of SMPs interconnected by a high-performance interconnect. This work is a collaboration between a numerical applications programmer and a cluster interconnect architect. The collaboration started with the modest goal of satisfying the communication needs of a specific numerical application, MITMatlab. However, by supporting the MPI standard, MPI-StarT readily extends support to a host of applications. MPI-StarT is derived from MPICH by developing a custom implementation of the Channel Interface. Some changes in MPICH's ADI and Protocol Layers are also necessary for correct and optimal operation. MPI-StarT relies on the host SMPs' shared memory mechanism for intra-SMP communication. Inter-SMP communication is supported through StarT-X. The StarT-X NIU allows a cluster of PCI-equipped host platforms to communicate over the Arctic Switch Fabric. Currently, StarT-X is utilized by a cluster of SUN E5000 SMPs as well as a cluster of Intel Pentium-II workstations. On a SUN E5000 with StarT-X, a processor can send and receive a 64-byte message in less than 0.4 and 3.5 usec, respectively, and incur less than 5.6 usec user-to-user one-way latency. StarT-X's remote memory-to-memory DMA mechanism can transfer large data blocks at 60 MByte/sec between SUN E5000s. This paper outlines our effort to preserve and deliver this level of communication performance through MPI-StarT to user applications. We have studied the requirements of MITMatlab and the capabilities of StarT-X and have formulated an implementation strategy for the Channel Interface. In this paper, we discuss some performance and correctness issues and their resolutions in MPI-StarT. The correctness issues range from the handling of arbitrarily large message sizes to deadlock-free support of nonblocking MPI operations. Performance optimizations include a shared-memory-based transport mechanism for intra-SMP communication and a broadcast mechanism that is aware of the performance difference between intra-SMP and the slower inter-SMP communication. We characterize the performance of MPI-StarT on a cluster of SUN E5000s. On SUN E5000s, MPI processes within the same SMP can communicate at over 150 MByte/sec using shared memory. When communicating between SMPs over StarT-X, MPI-StarT has a peak bandwidth of 56 MByte/sec. While fine-tuning of MPI-StarT is ongoing, we demonstrate that MPI-StarT is effective in enabling the speedup of MITMatlab on a cluster of SMPs by reporting on the performance of some representative numerical operations.
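    Illustrative sketch: the hierarchy-aware broadcast described in the abstract can be sketched in C against the modern MPI API. This is a reconstruction of the idea, not MPI-StarT's actual code; MPI_Comm_split_type is an MPI-3 call that postdates the paper, and the sketch assumes the broadcast root is rank 0 of the parent communicator.

        /* Two-level, topology-aware broadcast sketch: one hop over the
         * slower inter-SMP interconnect among node leaders, then a fast
         * intra-SMP fan-out over shared memory.  Assumes root is rank 0
         * of comm. */
        #include <mpi.h>

        int hier_bcast(void *buf, int count, MPI_Datatype type, MPI_Comm comm)
        {
            MPI_Comm node, leaders;
            int node_rank;

            /* Group the ranks that share one SMP's memory. */
            MPI_Comm_split_type(comm, MPI_COMM_TYPE_SHARED, 0,
                                MPI_INFO_NULL, &node);
            MPI_Comm_rank(node, &node_rank);

            /* One leader (local rank 0) per SMP; others get MPI_COMM_NULL. */
            MPI_Comm_split(comm, node_rank == 0 ? 0 : MPI_UNDEFINED,
                           0, &leaders);

            /* Stage 1: cross the inter-SMP interconnect once per node. */
            if (leaders != MPI_COMM_NULL) {
                MPI_Bcast(buf, count, type, 0, leaders);
                MPI_Comm_free(&leaders);
            }

            /* Stage 2: fan out inside each SMP over shared memory. */
            MPI_Bcast(buf, count, type, 0, node);
            MPI_Comm_free(&node);
            return MPI_SUCCESS;
        }

    MPI-StarT realizes this staging inside MPICH's Channel Interface rather than layered above MPI, but the principle is the same: pay the slow inter-SMP cost once per node, then let shared memory do the intra-SMP distribution.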
  • Keywords
    MITMatlab; MPI; MPICH; SMP; StarT-X; clustering; performance; Arctic; Collaborative work; Communication switching; Delay; Fabrics; Programming profession; Protocols; Sun; Switches; Workstations
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    SC98: IEEE/ACM Conference on Supercomputing, 1998
  • Print_ISBN
    0-8186-8707-X
• Type
    conf
  • DOI
    10.1109/SC.1998.10036
  • Filename
    1437304