• DocumentCode
    560208
  • Title

    A long-distance InfiniBand interconnection between two clusters in production use

  • Author

    Richling, Sabine ; Kredel, Heinz ; Hau, Steffen ; Kruse, Hans-Günther

  • Author_Institution
    IT-Center, Univ. of Heidelberg, Heidelberg, Germany
  • fYear
    2011
  • fDate
    12-18 Nov. 2011
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    We discuss operational and organizational issues of an InfiniBand interconnection between two clusters over a distance of 28 km in day-to-day production use. We describe the setup of hardware and networking components, and the solution of technical integration problems. Then we present solutions for a federated authorization system for the cluster within our two participating universities and other organizational integration problems. Performance measurements for MPI communication and file access to Lustre storage systems are presented. The results and a simple performance model show that MPI performance is intrinsically poor across the long-distance interconnection with limited bandwidth. However, file access and MPI communication among nodes on each side are barely affected by the limitations of the interconnection even at high load. Our organizational and technical setup allows the operation of the two clusters as a single system with lower administration costs and a better load balance than in a disconnected setup.
  • Keywords
    LAN interconnection; application program interfaces; authorisation; computer network performance evaluation; computer network security; message passing; optical fibre LAN; software performance evaluation; storage management; workstation clusters; Lustre storage system; MPI communication; federated authorization system; file access; hardware components; long-distance InfiniBand interconnection; networking components; operational issues; organizational integration problems; organizational issues; performance measurements; technical integration problems; Bandwidth; Blades; Educational institutions; Optics; Servers; Software; long-distance InfiniBand; operating clusters; performance model;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing, Networking, Storage and Analysis (SC), 2011 International Conference for
  • Conference_Location
    Seattle, WA
  • Electronic_ISBN
    978-1-4503-0771-0
  • Type

    conf

  • Filename
    6114476