Title :
Optimizing Data Locality for Fork/Join Programs Using Constrained Work Stealing
Author :
Lifflander, Jonathan ; Krishnamoorthy, Sriram ; Kale, Laxmikant V.
Author_Institution :
Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
Abstract :
We present an approach to improving data locality across different phases of fork/join programs scheduled using work stealing. The approach consists of: (1) user-specified and automated approaches to constructing a steal tree, the schedule of steal operations, and (2) constrained work-stealing algorithms that constrain the actions of the scheduler to mirror a given steal tree. These are combined to construct work-stealing schedules that maximize data locality across computation phases while ensuring load balance within each phase. These algorithms are also used to demonstrate dynamic coarsening, an optimization to improve spatial locality and sequential overheads by combining many finer-grained tasks into coarser tasks while ensuring sufficient concurrency for locality-optimized load balance. Implementation and evaluation in Cilk demonstrate performance improvements of up to 2.5x on 80 cores. We also demonstrate that dynamic coarsening can combine the performance benefits of coarse task specification with the adaptability of finer tasks.
Keywords :
concurrency control; parallel programming; processor scheduling; resource allocation; tree data structures; Cilk; automated approach; coarse task specification; coarser tasks; computation phases; concurrency; constrained work stealing; constrained work-stealing algorithms; constraint scheduler actions; data locality improvement; data locality maximization; data locality optimization; dynamic coarsening; fine-grained task adaptability; fork/join program scheduling; locality-optimized load balancing; performance improvements; sequential overhead improvement; spatial locality improvement; steal operation scheduling; steal tree construction; user-specified approach; work-stealing scheduling construction; Heuristic algorithms; Kernel; Optimization; Parallel processing; Runtime; Schedules; Synchronization; cilk; data locality; fork/join; task granularity;
Conference_Titel :
High Performance Computing, Networking, Storage and Analysis, SC14: International Conference for
Conference_Location :
New Orleans, LA
Print_ISBN :
978-1-4799-5499-5