Title :
Array operation synthesis to optimize HPF programs
Author :
Hwang, Gwan-Hwan ; Lee, Jenq Kuen ; Ju, Dz-ching R.
Author_Institution :
Dept. of Comput. Sci., Nat. Tsing Hua Univ., Hsinchu, Taiwan
Abstract :
The synthesis of consecutive array operations or array expressions into a composite access function of the source arrays at compile time can reduce the redundant data movement, temporary storage usage and loop synchronization overhead on flat shared-memory parallel machines with uniform memory accesses. However, it remains open how the synthesis scheme can be incorporated into optimizing HPF (High-Performance Fortran) programs on distributed-memory machines by taking into account communication costs. In this paper, we propose solutions to address this open problem. We first apply the array operation synthesis (developed for Fortran 90 programs) to HPF programs and demonstrate its performance benefits on distributed-memory machines. In addition, to prevent a situation we call the “synthesis performance anomaly”, we derive a cost model and present an optimal solution based on this model to guide the array synthesis process on distributed-memory machines. We also show that the optimal problem is NP-hard. Therefore, we develop a practical heuristic algorithm for compilers to devise a synthesis strategy on distributed-memory machines with HPF programs. Experimental results show a significant performance improvement over the base codes for HPF code fragments from real applications on a DEC Alpha processor farm by incorporating our proposed optimizations
Keywords :
DEC computers; FORTRAN; computational complexity; distributed memory systems; heuristic programming; optimisation; parallel languages; parallel programming; software performance evaluation; systolic arrays; DEC Alpha processor farm; HPF program optimization; High-Performance Fortran; NP-hard problem; array expressions; array operation synthesis; code fragments; communication cost model; compile-time synthesis; composite access function; consecutive array operations; distributed-memory machines; flat shared-memory parallel machines; heuristic algorithm; loop synchronization overhead; performance improvement; redundant data movement; synthesis performance anomaly; temporary storage usage; uniform memory accesses; Computer languages; Computer science; Cost function; Heuristic algorithms; Mathematics; Parallel machines; Program processors; Tail;
Conference_Titel :
Parallel Processing, 1996. Vol.3. Software., Proceedings of the 1996 International Conference on
Conference_Location :
Ithaca, NY
Print_ISBN :
0-8186-7623-X
DOI :
10.1109/ICPP.1996.538553