Title :
Distribution assignment placement: effective optimization of redistribution costs
Author :
Knoop, Jens ; Mehofer, Eduard
Author_Institution :
Dept. of Comput. Sci., Dortmund Univ., Germany
fDate :
6/1/2002 12:00:00 AM
Abstract :
Data locality and workload balance are key factors for getting high performance out of data-parallel programs on multiprocessor architectures. Data-parallel languages such as High-Performance Fortran (HPF) thus offer means allowing a programmer both to specify data distributions and to change them dynamically in order to maintain these properties. On the other hand, redistributions can be quite expensive and can significantly degrade a program's performance. They must thus be reduced to a minimum. In this article, we present a novel, aggressive approach for avoiding unnecessary remappings, which works by eliminating partially dead and partially redundant distribution changes. Basically, this approach evolves from extending and combining two algorithms for these optimizations, each achieving optimal results on its own. In distinction to the sequential setting, the data-parallel setting leads naturally to a family of algorithms of varying power and efficiency, allowing requirement-customized solutions. The power and flexibility of the new approach are demonstrated by various examples, which range from typical HPF fragments to real-world programs. Performance measurements underline its importance and show its effectiveness on different hardware platforms and in different settings.
Keywords :
parallel architectures; parallel languages; parallel programming; resource allocation; software performance evaluation; HPF fragments; High-Performance Fortran; algorithm efficiency; algorithm power; assignment elimination; data distribution specification; data flow analysis; data locality; data-parallel languages; data-parallel program performance; distribution assignment placement; dynamic data redistribution; hardware platforms; multiprocessor architectures; partially dead distribution changes; partially redundant distribution changes; performance measurements; redistribution cost optimization; requirement-customized solutions; unnecessary remappings; workload balance; Computer Society; Computer architecture; Cost function; Data analysis; Degradation; Hardware; Measurement; Optimizing compilers; Program processors; Programming profession
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
DOI :
10.1109/TPDS.2002.1011416