• DocumentCode
    1494565
  • Title
    Performance-based path determination for interprocessor communication in distributed computing systems
  • Author
    Kim, JunSeong ; Lilja, David J.

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Minnesota Univ., Minneapolis, MN, USA
  • Volume
    10
  • Issue
    3
  • fYear
    1999
  • fDate
    3/1/1999 12:00:00 AM
  • Firstpage
    316
  • Lastpage
    327
  • Abstract
    The different types of messages used by a parallel application program executing in a distributed computing system can each have unique characteristics so that no single communication network can produce the lowest latency for all messages. For instance, short control messages may be sent with the lowest overhead on one type of network, such as Ethernet, while bulk data transfers may be better suited to a different type of network, such as Fibre Channel or HIPPI. This work investigates how to exploit multiple heterogeneous communication networks that interconnect the same set of processing nodes using a set of techniques we call performance-based path determination (PBPD). The performance-based path selection (PBPS) technique selects the best (lowest latency) network among several for each individual message to reduce the communication overhead of parallel programs. The performance-based path aggregation (PBPA) technique, on the other hand, aggregates multiple networks into a single virtual network to increase the available bandwidth. We test the PBPD techniques on a cluster of SGI multiprocessors interconnected with Ethernet, Fibre Channel, and HIPPI networks using a custom communication library built on top of the TCP/IP protocol layers. We find that PBPS can reduce communication overhead in applications compared to using either network alone, while aggregating networks into a single virtual network can reduce communication latency for bandwidth-limited applications. The performance of the PBPD techniques depends on the mix of message sizes in the application program and the relative overheads of the networks, as demonstrated in our analytical models.
  • Keywords
    local area networks; network interfaces; performance evaluation; transport protocols; workstation clusters; Ethernet; Fibre Channel; HIPPI; SGI multiprocessors; TCP/IP protocol layers; analytical models; bulk data transfers; custom communication library; distributed computing systems; interprocessor communication; multiple heterogeneous communication networks; parallel application program; parallel programs; performance-based path determination; performance-based path selection; short control messages; virtual network; Aggregates; Bandwidth; Communication networks; Communication system control; Delay; Distributed computing; Ethernet networks; Libraries; Optical fiber communication; Optical fiber testing;
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9219
  • Type
    jour
  • DOI
    10.1109/71.755832
  • Filename
    755832
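
The selection and aggregation ideas summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' communication library or their analytical model: it assumes a simple linear cost model (fixed per-message overhead plus size divided by bandwidth), and the `Network` class, the function names, and all overhead and bandwidth figures are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Network:
    name: str
    overhead_us: float      # fixed per-message cost in microseconds (hypothetical)
    bandwidth_mb_s: float   # sustained bandwidth in MB/s (hypothetical)

    def latency_us(self, size_bytes: int) -> float:
        # Assumed linear model: bytes / (MB/s) gives microseconds (1 MB = 1e6 bytes).
        return self.overhead_us + size_bytes / self.bandwidth_mb_s


def select_path(networks, size_bytes):
    """PBPS-style choice: the network with the lowest modeled latency for this message."""
    return min(networks, key=lambda n: n.latency_us(size_bytes))


def aggregate_latency_us(networks, size_bytes):
    """PBPA-style estimate: split one message so all used networks finish together.

    Networks whose fixed overhead exceeds the common finish time are dropped,
    because sending any share over them would only slow the transfer down.
    """
    active = sorted(networks, key=lambda n: n.overhead_us)
    while active:
        total_bw = sum(n.bandwidth_mb_s for n in active)
        weighted = sum(n.overhead_us * n.bandwidth_mb_s for n in active)
        finish = (size_bytes + weighted) / total_bw
        if finish >= active[-1].overhead_us:
            return finish
        active.pop()  # drop the highest-overhead network and retry
    raise ValueError("at least one network is required")


# Illustrative figures only; they are not measurements from the paper.
ETHERNET = Network("Ethernet", overhead_us=100.0, bandwidth_mb_s=1.0)
FIBRE_CHANNEL = Network("Fibre Channel", overhead_us=400.0, bandwidth_mb_s=25.0)

if __name__ == "__main__":
    nets = [ETHERNET, FIBRE_CHANNEL]
    for size in (64, 1_000_000):
        best = select_path(nets, size)
        print(size, best.name, best.latency_us(size), aggregate_latency_us(nets, size))
```

Under these assumed figures, the short message goes to the low-overhead network and the bulk transfer goes to the high-bandwidth one, while aggregation only helps the bulk transfer. This mirrors the abstract's point that the benefit depends on the mix of message sizes and the relative overheads of the networks.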