DocumentCode :
1783294
Title :
Victim Selection and Distributed Work Stealing Performance: A Case Study
Author :
Perarnau, Swann ; Sato, Mitsuhisa
Author_Institution :
RIKEN, AICS, Kobe, Japan
fYear :
2014
fDate :
19-23 May 2014
Firstpage :
659
Lastpage :
668
Abstract :
Work stealing is a popular solution for dynamic load balancing of irregular computations, on both shared memory and distributed memory systems. While the shared memory performance of work stealing is well understood, distributing this algorithm across several thousand nodes can introduce new performance issues. In particular, most studies of work stealing assume that all participating processes are equidistant from each other in terms of communication latency. This paper presents a new performance evaluation of the popular UTS benchmark, in its work stealing implementation, on the scale of tens of thousands of compute nodes. Taking advantage of the physical scale of the K Computer, we investigate in detail the performance impact of communication latencies on work stealing. In particular, we introduce a new performance metric to assess the time needed by the work stealing scheduler to distribute work among all processes. Using this metric, we identify a previously overlooked issue: the victim selection function used by the work stealing application can severely impact its performance at large scale. To solve this issue, we introduce a new strategy that takes into account the physical distance between nodes and achieves significant performance improvements.
Keywords :
distributed memory systems; performance evaluation; resource allocation; shared memory systems; K computer; UTS benchmark; communication latency; distributed work stealing performance; dynamic load balancing; performance metric; victim selection function; work stealing scheduler; Benchmark testing; Blades; Computers; Load management; Measurement; Memory management; Resource management; distributed load balancing; latency; work stealing
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Processing Symposium, 2014 IEEE 28th International
Conference_Location :
Phoenix, AZ
ISSN :
1530-2075
Print_ISBN :
978-1-4799-3799-8
Type :
conf
DOI :
10.1109/IPDPS.2014.74
Filename :
6877298