DocumentCode :
560208
Title :
A long-distance InfiniBand interconnection between two clusters in production use
Author :
Richling, Sabine ; Kredel, Heinz ; Hau, Steffen ; Kruse, Hans-Günther
Author_Institution :
IT-Center, Univ. of Heidelberg, Heidelberg, Germany
fYear :
2011
fDate :
12-18 Nov. 2011
Firstpage :
1
Lastpage :
8
Abstract :
We discuss operational and organizational issues of an InfiniBand interconnection between two clusters over a distance of 28 km in day-to-day production use. We describe the setup of hardware and networking components, and the solution of technical integration problems. Then we present solutions for a federated authorization system for the cluster within our two participating universities and other organizational integration problems. Performance measurements for MPI communication and file access to Lustre storage systems are presented. The results and a simple performance model show that MPI performance is intrinsically poor across the long-distance interconnection with limited bandwidth. However, file access and MPI communication among nodes on each side are barely affected by the limitations of the interconnection even at high load. Our organizational and technical setup allows the operation of the two clusters as a single system with lower administration costs and a better load balance than in a disconnected setup.
Keywords :
LAN interconnection; application program interfaces; authorisation; computer network performance evaluation; computer network security; message passing; optical fibre LAN; software performance evaluation; storage management; workstation clusters; Lustre storage system; MPI communication; federated authorization system; file access; hardware components; long-distance InfiniBand interconnection; networking components; operational issues; organizational integration problems; organizational issues; performance measurements; technical integration problems; Bandwidth; Blades; Educational institutions; Optics; Servers; Software; long-distance InfiniBand; operating clusters; performance model;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
High Performance Computing, Networking, Storage and Analysis (SC), 2011 International Conference for
Conference_Location :
Seattle, WA
Electronic_ISBN :
978-1-4503-0771-0
Type :
conf
Filename :
6114476