Title :
Slack-Conscious Lightweight Loop Scheduling for Improving Scalability of Bulk-synchronous MPI Applications
Author :
Kale, Vivek ; Gamblin, Todd ; Hoefler, Torsten ; de Supinski, Bronis R. ; Gropp, William D.
Abstract :
Due to the strict communication dependences in the global collective communication of MPI applications, noise that delays one process can amplify across processes in a large run. The overhead caused by noise amplification can increase dramatically as we scale the application to very large numbers of processes (10,000 or more). For hybrid OpenMP/MPI (or MPI+X) applications, we can reduce noise amplification with on-node dynamic thread scheduling. However, the dequeue overhead of such schemes can be steep. To mitigate this cost, we have introduced lightweight scheduling, which combines dynamic and static task scheduling to reduce the total number of dequeue operations while still absorbing noise. Our scheme allows for portability and performance consistency, without reducing the absolute performance of the application. In this work, we reduce the overhead of our scheme further by carefully using more static scheduling when we know that noise will not be amplified. We exploit a priori knowledge of per-process MPI slack to increase the static fraction for those MPI processes that are known not to be on the critical path and thus likely not to amplify noise. We find that this technique gives an 11% performance gain over the original lightweight scheduling (17% gain over OpenMP static scheduling) when we run an algebraic multigrid application in runs of up to 16,384 processes (1,024 nodes) on a NUMA cluster, and we are able to project further performance gains on machines with node counts beyond 10,000.
Keywords :
Dynamic Scheduling; MPI; Noise Amplification; OpenMP; Performance Optimization; Slack; System Noise
Conference_Title :
High Performance Computing, Networking, Storage and Analysis (SCC), 2012 SC Companion
Conference_Location :
Salt Lake City, UT
Print_ISBN :
978-1-4673-6218-4
DOI :
10.1109/SC.Companion.2012.209