DocumentCode :
1486444
Title :
Improving cache locality by a combination of loop and data transformations
Author :
Kandemir, Mahmut ; Ramanujam, J. ; Choudhary, Alok
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Syracuse Univ., NY, USA
Volume :
48
Issue :
2
fYear :
1999
fDate :
2/1/1999 12:00:00 AM
Firstpage :
159
Lastpage :
167
Abstract :
Exploiting locality of reference is key to realizing high levels of performance on modern processors. This paper describes a compiler algorithm for optimizing cache locality in scientific codes on uniprocessor and multiprocessor machines. A distinctive characteristic of our algorithm is that it considers loop and data layout transformations in a unified framework. Our approach is very effective at reducing cache misses and can optimize some nests for which optimization techniques based on loop transformations alone are not successful. An important special case is one in which data layouts of some arrays are fixed and cannot be changed. We show how our algorithm can accommodate this case and demonstrate how it can be used to optimize multiple loop nests. Experiments on several benchmarks show that the techniques presented in this paper result in substantial improvement in cache performance.
Keywords :
cache storage; optimising compilers; cache locality; cache performance; data layout transformations; loop transformations; multiple loop nests; optimization; reducing cache misses; Cache memory; Delay; Hardware; High performance computing; Optimizing compilers; Program processors; Programming profession; Protocols; Scheduling; Software performance;
fLanguage :
English
Journal_Title :
Computers, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/12.752657
Filename :
752657