Title :
Message progression in parallel computing - to thread or not to thread?
Author :
Hoefler, Torsten ; Lumsdaine, Andrew
Author_Institution :
Open Systems Lab, Indiana University, Bloomington, IN
Date :
Sept. 29 - Oct. 1, 2008
Abstract :
Message progression schemes that enable communication and computation to be overlapped have the potential to improve the performance of parallel applications. With currently available high-performance networks there are several options for making progress: manual progression, use of a progress thread, and communication offload. In this paper we analyze threaded progression approaches, comparing the effects of using shared or dedicated CPU cores for progression. To perform these comparisons, we propose time-based and work-based benchmark schemes. As expected, threaded progression performs well when a spare core is available to be dedicated to communication progression, but a number of operating system effects prevent the same benefits from being obtained when communication progress must share a core with computation. We show that some limited performance improvement can be obtained in the shared-core case by real-time scheduling of the progress thread.
Keywords :
parallel processing; scheduling; CPU cores; communication offloads; high-performance networks; message progression schemes; operating system; parallel applications; parallel computing; real-time scheduling; work-based benchmark schemes; Communication standards; Hardware; Intelligent networks; Libraries; Middleware; Operating systems; Parallel processing; Protocols; Testing; Yarn
Conference_Title :
Cluster Computing, 2008 IEEE International Conference on
Conference_Location :
Tsukuba
Print_ISBN :
978-1-4244-2639-3
ISSN :
1552-5244
DOI :
10.1109/CLUSTR.2008.4663774