Title :
Memory bank disambiguation using modulo unrolling for Raw machines
Author :
Barua, Rajeev ; Lee, Walter ; Amarasinghe, Saman ; Agarwal, Anant
Author_Institution :
Lab. for Comput. Sci., MIT, Cambridge, MA, USA
Abstract :
We present modulo unrolling, a code transformation technique for enabling array references to be accessed through the fast static network on a Raw machine. A Raw machine comprises a mesh of simple, replicated tiles connected by an interconnect that supports fast, static near-neighbor communication. Like all other resources, memory is distributed across the tiles. Management of the memory can be performed by well-known techniques that generate the requisite communication code on distributed address-space architectures. On the other hand, the fast, static network provides the compiler with a simple interface to optimize such communication. This paper addresses the problem of taking advantage of such static communication for memory accesses. Static memory communication requires compile-time knowledge of the exact communication needed for each memory reference. This knowledge, in turn, can be obtained if a memory reference refers exclusively to memory residing on a single processing tile. We introduce modulo unrolling as a technique that allows the static communication of a large class of array accesses. We show how this technique achieves the goal of static communication by using a relatively small unroll factor. For a set of dense matrix scientific applications, we are able to access all the array references on the static network, enabling scalable speedups on the Raw machine.
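The idea that unrolling can make each memory reference resolve to a single tile can be illustrated with a small model. The sketch below is not taken from the paper: it assumes, for illustration only, that array element i resides on bank i mod NUM_BANKS (low-order interleaving), and every name in it (NUM_BANKS, bank_of, and the two helper functions) is hypothetical. It records which banks each static reference in a loop body touches, before and after unrolling by the number of banks.

```python
# Illustrative model only. Assumption: element i of an array is stored on
# bank (i % NUM_BANKS). All names here are hypothetical.
NUM_BANKS = 4


def bank_of(index, num_banks=NUM_BANKS):
    """Bank holding element `index` under the assumed interleaving."""
    return index % num_banks


def banks_touched_rolled(n, num_banks=NUM_BANKS):
    """Original loop `for i in range(n): use A[i]`.

    The body has one static reference, A[i]. Return the set of banks
    that this single reference touches over all iterations.
    """
    return {bank_of(i, num_banks) for i in range(n)}


def banks_touched_unrolled(n, num_banks=NUM_BANKS):
    """Loop unrolled by `num_banks`, with i stepping by `num_banks`.

    The body has static references A[i+0], ..., A[i+num_banks-1].
    Return a dict mapping each reference's offset k to the set of banks
    that reference touches. Leftover iterations (n % num_banks) are
    ignored in this sketch.
    """
    touched = {k: set() for k in range(num_banks)}
    for i in range(0, n - n % num_banks, num_banks):
        for k in range(num_banks):
            touched[k].add(bank_of(i + k, num_banks))
    return touched


if __name__ == "__main__":
    print(banks_touched_rolled(16))    # the single reference spans every bank
    print(banks_touched_unrolled(16))  # each reference stays on one bank
```

In this model the rolled reference A[i] touches all four banks, so its bank is unknown at compile time, while each reference in the unrolled body touches exactly one bank and could therefore be routed statically. How the actual technique chooses the unroll factor and handles general affine accesses is described in the paper itself and is not modeled here.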
Keywords :
distributed memory systems; optimising compilers; parallel programming; software performance evaluation; storage management; Raw machines; array references; code transformation technique; dense matrix scientific applications; distributed address-space architectures; distributed memory; memory bank disambiguation; memory management; memory reference; modulo unrolling; optimizing compiler; replicated tiles; static communication; static near-neighbor communication; Bandwidth; Computer science; Costs; Electronic switching systems; Laboratories; Memory management; Optimizing compilers; Registers; Tiles; World Wide Web;
Conference_Title :
High Performance Computing, 1998. HIPC '98. 5th International Conference On
Conference_Location :
Madras
Print_ISBN :
0-8186-9194-8
DOI :
10.1109/HIPC.1998.737991