Title :
Efficient loop partitioning for parallel codes of irregular scientific computations
Author :
Guo, Minyi ; Li, Li ; Chang, Weng-Long
Author_Institution :
Dept. of Comput. Software, Univ. of Aizu, Japan
Abstract :
In most distributed memory computations, node programs are executed on processors according to the owner computes rule. However, the owner computes rule is not well suited to irregular application codes. In such codes, indirection in accessing the left-hand-side array makes it difficult to partition loop iterations, and, because right-hand-side elements are also accessed through indirection, heuristics other than the owner computes rule may reduce total communication. In this paper we propose a computes rule for irregular loop partitioning that reduces communication cost, called the least communication computes rule. We assign each loop iteration to the processor on which executing that iteration incurs minimal communication cost. After all iterations have been assigned to processors, we give a global vs. local data transformation rule, indirection array remapping, and communication optimization methods. The experimental results show that, in most cases, our approach achieved better performance than other loop partitioning rules.
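The partitioning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a block distribution of the data arrays, an irregular loop of the form x[ia[i]] = y[ib[i]] + y[ic[i]], and a cost model that simply counts remote element references per iteration; the paper's actual cost model is not given in this record. All function names (block_owner, least_communication_partition, owner_computes_partition, remote_accesses) and the sample indirection arrays are hypothetical.

```python
from collections import Counter


def block_owner(index, n, nprocs):
    """Owner of global element `index` under an assumed block distribution."""
    block = -(-n // nprocs)  # ceiling division: elements per processor
    return index // block


def least_communication_partition(refs_per_iter, n, nprocs):
    """Assign each iteration to the processor with the fewest remote references.

    refs_per_iter[i] lists the global indices touched by iteration i.
    Ties are broken in favour of the lowest processor id (an assumption).
    """
    assignment = []
    for refs in refs_per_iter:
        local = Counter(block_owner(g, n, nprocs) for g in refs)
        best = min(range(nprocs), key=lambda p: (len(refs) - local[p], p))
        assignment.append(best)
    return assignment


def owner_computes_partition(lhs_indices, n, nprocs):
    """Baseline: each iteration runs on the owner of its left-hand-side element."""
    return [block_owner(g, n, nprocs) for g in lhs_indices]


def remote_accesses(refs_per_iter, assignment, n, nprocs):
    """Total number of references to elements not owned by the executing processor."""
    return sum(
        1
        for refs, p in zip(refs_per_iter, assignment)
        for g in refs
        if block_owner(g, n, nprocs) != p
    )


# Hypothetical loop: x[ia[i]] = y[ib[i]] + y[ic[i]], 8 elements on 2 processors.
n, nprocs = 8, 2
ia = [0, 5, 2, 7]
ib = [4, 1, 3, 6]
ic = [5, 2, 6, 0]
refs = list(zip(ia, ib, ic))

lcc = least_communication_partition(refs, n, nprocs)
occ = owner_computes_partition(ia, n, nprocs)
print("least communication:", lcc, remote_accesses(refs, lcc, n, nprocs))
print("owner computes:     ", occ, remote_accesses(refs, occ, n, nprocs))
```

In this toy input the first two iterations reference more data on the processor that does not own the left-hand-side element, so moving them away from the owner lowers the remote reference count. A real partitioner would also need to balance load across processors, which this sketch ignores.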
Keywords :
communication complexity; distributed memory systems; natural sciences computing; parallel programming; processor scheduling; program control structures; communication cost reduction computes rule; communication optimization methods; distributed memory computations; efficient loop partitioning; global data transformation rule; heuristics; indirection array remapping; irregular loop partitioning; irregular scientific computations; least communication computes rule; left hand side array access; local data transformation rule; loop iterations; node programs; parallel codes; right hand side element access; total communication; Computational fluid dynamics; Computer architecture; Concurrent computing; Costs; Distributed computing; Memory management; Optimizing compilers; Power engineering computing; Runtime; Telecommunication computing;
Conference_Title :
Algorithms and Architectures for Parallel Processing, 2002. Proceedings. Fifth International Conference on
Conference_Location :
Beijing, China
Print_ISBN :
0-7695-1512-6
DOI :
10.1109/ICAPP.2002.1173553