Title :
A generalized basic-cycle calculation method for efficient array redistribution
Author :
Hsu, Ching-Hsien ; Bai, Sheng-Wen ; Chung, Yeh-Ching ; Yang, Chu-Sing
Author_Institution :
Dept. of Inf. Eng., Feng Chia Univ., Taichung, Taiwan
fDate :
12/1/2000
Abstract :
In many scientific applications, dynamic array redistribution is often required to enhance the performance of an algorithm. In this paper, we present a generalized basic-cycle calculation (GBCC) method that efficiently redistributes an array from BLOCK-CYCLIC(s) over P processors to BLOCK-CYCLIC(t) over Q processors. In the GBCC method, a processor first computes the source/destination processor/data sets of the array elements in the first generalized basic-cycle of the local array it owns. A generalized basic-cycle is defined as lcm(sP, tQ)/(gcd(s, t)×P) in the source distribution and lcm(sP, tQ)/(gcd(s, t)×Q) in the destination distribution. From the source/destination processor/data sets of the array elements in the first generalized basic-cycle, we construct packing/unpacking pattern tables that minimize the data-movement operations. Since every generalized basic-cycle has the same communication pattern, a processor can use the packing/unpacking pattern tables to pack and unpack array elements efficiently. To evaluate the performance of the GBCC method, we have implemented it on an IBM SP2 parallel machine, along with the PITFALLS method and the ScaLAPACK method. Cost models for the three methods are also presented. The experimental results show that the GBCC method outperforms the PITFALLS method and the ScaLAPACK method for all test samples. A brief description of the extension of the GBCC method to multidimensional array redistributions is also presented.
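The two generalized basic-cycle expressions in the abstract, and the claim that the communication pattern repeats, can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not build their packing/unpacking tables. It assumes 0-based global indices and the usual block-cyclic ownership rule (element i belongs to processor (i // block) mod nprocs); the function names owner, gbc_sizes and send_pattern are hypothetical and chosen only for illustration.

```python
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def owner(i, block, nprocs):
    """Processor that owns global index i (0-based) under
    BLOCK-CYCLIC(block) over nprocs processors.
    Assumption: the usual block-cyclic ownership rule."""
    return (i // block) % nprocs


def gbc_sizes(s, P, t, Q):
    """Evaluate the two expressions given in the abstract:
    lcm(sP, tQ) / (gcd(s, t) * P) for the source distribution and
    lcm(sP, tQ) / (gcd(s, t) * Q) for the destination distribution."""
    period = lcm(s * P, t * Q)
    g = gcd(s, t)
    return period // (g * P), period // (g * Q)


def send_pattern(s, P, t, Q):
    """For each source processor, list the destination processor of every
    element it owns within one period of lcm(sP, tQ) global indices.
    Illustrative only: the paper's pattern tables are not reproduced here."""
    period = lcm(s * P, t * Q)
    table = {p: [] for p in range(P)}
    for i in range(period):
        table[owner(i, s, P)].append(owner(i, t, Q))
    return table


if __name__ == "__main__":
    s, P, t, Q = 2, 3, 3, 2
    print(gbc_sizes(s, P, t, Q))
    print(send_pattern(s, P, t, Q))
```

Because source ownership repeats every sP global indices and destination ownership every tQ, the (source, destination) pair of an element repeats every lcm(sP, tQ) indices. That periodicity is what allows a pattern computed once to be reused for the rest of the array.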
Keywords :
distributed memory systems; parallel processing; performance evaluation; IBM SP2 parallel machine; PITFALLS method; ScaLAPACK method; array elements; communication pattern; data-movement operations; efficient array redistribution; generalized basic-cycle calculation method; multidimensional array redistributions; performance enhancement; source distribution; Application software; Computer Society; Computer languages; Costs; Multidimensional systems; Parallel machines; Parallel programming; Phased arrays; Program processors; Testing;
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on