Title :
Compiler-controlled caching in superword register files for multimedia extension architectures
Author :
Shin, Jaewook ; Chame, Jacqueline ; Hall, Mary W.
Author_Institution :
Inf. Sci. Inst., Univ. of Southern California, Marina del Rey, CA, USA
Abstract :
In this paper, we describe an algorithm and implementation of locality optimizations for architectures with instruction sets such as Intel's SSE and Motorola's AltiVec that support operations on superwords, i.e., aggregate objects consisting of several machine words. We treat the large superword register file as a compiler-controlled cache, thus avoiding unnecessary memory accesses by exploiting reuse in superword registers. This research is distinguished from previous work on exploiting reuse in scalar registers because it considers not only temporal but also spatial reuse. In contrast to optimizations that exploit reuse in cache, the compiler must also manage replacement and therefore explicitly name registers in the generated code. We describe an implementation of our approach integrated with a compiler that exploits superword-level parallelism (SLP). We present a set of results derived automatically on 4 multimedia kernels and 2 scientific benchmarks. Our results show speedups ranging from 1.3X to 2.8X on the 6 programs as compared to using SLP alone, and we eliminate the majority of memory accesses.
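The idea of holding a superword in a register across iterations instead of reloading it from memory can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the algorithm from the paper: the superword width `W = 4`, the `Memory` class, and both `scale_rows_*` functions are hypothetical names introduced here, and plain loop interchange stands in for whatever transformations the compiler actually applies. The sketch counts superword loads for an elementwise row scaling, with and without keeping the reused operand in a local variable that plays the role of a named register.

```python
W = 4  # assumed superword width: four elements per register


class Memory:
    """Flat array that counts how many superword loads are issued."""

    def __init__(self, data):
        self.data = list(data)
        self.loads = 0

    def load_superword(self, index):
        """Load W consecutive elements starting at an aligned index."""
        assert index % W == 0, "superword loads must be aligned"
        self.loads += 1
        return tuple(self.data[index:index + W])


def scale_rows_naive(a, b, rows, cols):
    """c[i][j] = a[i][j] * b[j]; b's superword is reloaded for every row."""
    out = []
    for i in range(rows):
        row = []
        for j in range(0, cols, W):
            va = a.load_superword(i * cols + j)
            vb = b.load_superword(j)  # same superword fetched again per row
            row.extend(x * y for x, y in zip(va, vb))
        out.append(row)
    return out


def scale_rows_reuse(a, b, rows, cols):
    """Same result; each superword of b is loaded once and kept in vb."""
    out = [[0] * cols for _ in range(rows)]
    for j in range(0, cols, W):
        vb = b.load_superword(j)  # held in a "register" across all rows
        for i in range(rows):
            va = a.load_superword(i * cols + j)
            out[i][j:j + W] = [x * y for x, y in zip(va, vb)]
    return out
```

With 3 rows and 8 columns, the naive version issues one load of `b` per row and column block, while the reuse version issues one per column block; the loads of `a` are unchanged because each of its superwords is used only once.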
Keywords :
cache storage; instruction sets; multimedia computing; optimising compilers; parallel architectures; parallelising compilers; software performance evaluation; Intel SSE; Motorola AltiVec; compiler-controlled caching; explicit register naming; generated code; instruction sets; locality optimizations; memory accesses; multimedia extension architectures; multimedia kernels; replacement; scientific benchmarks; spatial reuse; speedups; superword register files; superword-level parallelism; superwords; Aggregates; Bandwidth; Computer aided instruction; Computer architecture; Delay; Instruction sets; Multimedia computing; Optimizing compilers; Parallel processing; Registers;
Conference_Title :
Parallel Architectures and Compilation Techniques, 2002. Proceedings. 2002 International Conference on
Print_ISBN :
0-7695-1620-3
DOI :
10.1109/PACT.2002.1106003