Title :
Load-balanced pipeline parallelism
Author :
Kamruzzaman, Md ; Swanson, Stephen ; Tullsen, Dean M.
Author_Institution :
Comput. Sci. & Eng., Univ. of California, San Diego, La Jolla, CA, USA
Abstract :
Accelerating a single thread in current parallel systems remains a challenging problem, because sequential threads do not naturally take advantage of the additional cores. Recent work shows that automatic extraction of pipeline parallelism is an effective way to speed up single-thread execution. However, two problems remain challenging: load balancing and inter-thread communication. This work presents a new mechanism for exploiting pipeline parallelism that naturally solves the load-balancing and communication problems. This compiler-based technique automatically extracts the pipeline stages and executes them in a data-parallel fashion, using token-based chunked synchronization to handle sequential stages. The technique provides linear speedup for several applications and outperforms prior techniques for exploiting pipeline parallelism by as much as 50%.
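As a rough illustration of the idea summarized in the abstract, the sketch below runs every pipeline stage for a chunk of iterations on the same thread and uses a token, passed from chunk to chunk in order, to serialize the sequential stage. It is a minimal hand-written sketch, not the authors' compiler-generated code: the function `run_chunked_pipeline`, its parameters, the two-stage split, and the round-robin chunk assignment are all assumptions made for illustration.

```python
import threading


def run_chunked_pipeline(items, parallel_stage, sequential_stage,
                         num_threads=4, chunk_size=8):
    """Hypothetical sketch: each thread runs ALL stages for its chunks.

    The parallel stage runs without synchronization. The sequential
    stage runs only while the thread holds the token for its chunk,
    so chunks pass through that stage in their original order.
    """
    chunks = [items[i:i + chunk_size]
              for i in range(0, len(items), chunk_size)]
    results = [None] * len(chunks)
    token = 0  # index of the chunk currently allowed into the sequential stage
    cond = threading.Condition()

    def worker(tid):
        nonlocal token
        # Assumed policy: chunks are assigned to threads round-robin.
        for c in range(tid, len(chunks), num_threads):
            # Parallel stage: no cross-chunk dependences, no waiting.
            staged = [parallel_stage(x) for x in chunks[c]]

            # Wait until the token reaches this chunk.
            with cond:
                cond.wait_for(lambda: token == c)

            # Sequential stage: only the token holder executes here.
            results[c] = [sequential_stage(y) for y in staged]

            # Pass the token to the next chunk.
            with cond:
                token += 1
                cond.notify_all()

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Flatten per-chunk results back into iteration order.
    return [r for chunk in results for r in chunk]
```

Synchronizing once per chunk rather than once per iteration is what keeps the token-passing cost low in this sketch; a larger `chunk_size` means fewer token hand-offs but coarser work units.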
Keywords :
multiprocessing systems; parallel processing; pipeline processing; program compilers; resource allocation; synchronisation; compiler-based technique; data parallel fashion; inter-thread communication; load-balanced pipeline parallelism; parallel systems; pipeline stage automatic extraction; sequential stages; sequential threads; single thread acceleration; token-based chunked synchronization; Instruction sets; Load management; Pipeline processing; Pipelines; Synchronization; chip multiprocessors; compilers; load-balancing; locality; pipeline parallelism
Conference_Title :
High Performance Computing, Networking, Storage and Analysis (SC), 2013 International Conference for
Conference_Location :
Denver, CO
Print_ISBN :
978-1-4503-2378-9
DOI :
10.1145/2503210.2503295