DocumentCode
451156
Title
Adaptive Two-level Thread Management for Fast MPI Execution on Shared Memory Machines
Author
Shen, Kai ; Tang, Hong ; Yang, Tao
Author_Institution
University of California, Santa Barbara
fYear
1999
fDate
13-18 Nov. 1999
Firstpage
49
Lastpage
49
Abstract
This paper addresses performance portability of MPI code on multiprogrammed shared memory machines. Conventional MPI implementations map each MPI node to an OS process and suffer severe performance degradation in multiprogrammed environments. Our previous work (TMPI) developed compile-time and run-time techniques to support threaded MPI execution by mapping each MPI node to a kernel thread. However, kernel threads have a higher context switch cost than user-level threads, which leads to a longer spinning time requirement during MPI synchronization. This paper presents an adaptive two-level thread scheme for MPI to reduce context switch and synchronization cost. The scheme also exposes thread scheduling information at user level, which allows us to design an adaptive event waiting strategy that minimizes CPU spinning and exploits cache affinity. Our experiments show that the MPI system based on the proposed techniques has great performance advantages over the previous version of TMPI and the SGI MPI implementation in multiprogrammed environments. The improvement ratio can reach 161% or even more, depending on the degree of multiprogramming.
Keywords
Costs; Degradation; Kernel; Memory management; Operating systems; Protocols; Runtime; Spinning; Switches; Yarn;
fLanguage
English
Publisher
IEEE
Conference_Titel
Supercomputing, ACM/IEEE 1999 Conference
Print_ISBN
1-58113-091-0
Type
conf
DOI
10.1109/SC.1999.10061
Filename
1592692