DocumentCode
1920583
Title
Added Concurrency to Improve MPI Performance on Multicore
Author
Kamal, Humaira ; Wagner, Alan
Author_Institution
Dept. of Comput. Sci., Univ. of British Columbia, Vancouver, BC, Canada
fYear
2012
fDate
10-13 Sept. 2012
Firstpage
229
Lastpage
238
Abstract
MPI implementations typically equate an MPI process with an OS-process, resulting in a coarse-grain programming model where MPI processes are bound to the physical cores. Fine-Grain MPI (FG-MPI) extends the MPICH2 implementation of MPI and implements an integrated runtime system to allow multiple MPI processes to execute concurrently inside an OS-process. FG-MPI's integrated approach makes it possible to add more concurrency than available parallelism, while minimizing the overheads related to context switches, scheduling and synchronization. In this paper we evaluate the benefits of added concurrency for cache awareness and message size, and show that performance gains are possible by using FG-MPI to adjust the grain-size of a program to better fit the cache and to exploit the potential advantages of passing smaller rather than larger messages. We evaluate the use of FG-MPI on the complete set of the NAS parallel benchmarks over large problem sizes, where we show significant performance improvement (20%-30%) for three of the eight benchmarks. We discuss the characteristics of the benchmarks with regard to trade-offs between the added costs and benefits.
Keywords
cache storage; message passing; multiprocessing systems; operating systems (computers); parallel memories; processor scheduling; synchronisation; FG-MPI integrated approach; MPI; MPICH2 implementation; NAS parallel benchmark; OS process; cache awareness; coarse grain programming model; concurrency; context switch; fine grain MPI; integrated runtime system; message size; multicore processor; scheduling; synchronization; Benchmark testing; Concurrent computing; Context; Middleware; Multicore processing; Runtime; Switches; Concurrency; Fine-Grain MPI; MPICH2; Message Passing; Multicore; Over-decomposition; Performance;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel Processing (ICPP), 2012 41st International Conference on
Conference_Location
Pittsburgh, PA
ISSN
0190-3918
Print_ISBN
978-1-4673-2508-0
Type
conf
DOI
10.1109/ICPP.2012.15
Filename
6337584