DocumentCode
3687133
Title
Automatic cluster parallelization and minimizing communication via selective data replication
Author
Sanket Tavarageri;Benoît Meister;Muthu Baskaran;Benoît Pradelle;Tom Henretty;Athanasios Konstantinidis;Ann Johnson;Richard Lethin
Author_Institution
Reservoir Labs, 632 Broadway, New York, NY 10012, United States
fYear
2015
Firstpage
1
Lastpage
7
Abstract
Technology scaling has initiated two distinct trends that are likely to continue into the future: first, increased parallelism in hardware, and second, the increasing performance and energy cost of communication relative to computation. Both trends call for the development of compiler and runtime systems that automatically parallelize programs and reduce communication in parallel computations, so that high performance is achieved in an energy-efficient fashion. In this paper, we propose the design of an integrated compiler and runtime system that auto-parallelizes loop nests for clusters, together with a novel communication-avoidance method that reduces data movement between processors. Communication minimization is achieved via selective data replication: data is replicated so that a larger share of the whole data set may be mapped to each processor, and hence non-local memory accesses are reduced. Experiments on a number of benchmarks show the effectiveness of the approach.
Keywords
"Arrays","Program processors","Runtime","Benchmark testing","Minimization","Memory management","Complexity theory"
Publisher
ieee
Conference_Titel
High Performance Extreme Computing Conference (HPEC), 2015 IEEE
Type
conf
DOI
10.1109/HPEC.2015.7322481
Filename
7322481