DocumentCode
1486444
Title
Improving cache locality by a combination of loop and data transformations
Author
Kandemir, Mahmut ; Ramanujam, J. ; Choudhary, Alok
Author_Institution
Dept. of Electr. Eng. & Comput. Sci., Syracuse Univ., NY, USA
Volume
48
Issue
2
fYear
1999
fDate
2/1/1999 12:00:00 AM
Firstpage
159
Lastpage
167
Abstract
Exploiting locality of reference is key to realizing high levels of performance on modern processors. This paper describes a compiler algorithm for optimizing cache locality in scientific codes on uniprocessor and multiprocessor machines. A distinctive characteristic of our algorithm is that it considers loop and data layout transformations in a unified framework. Our approach is very effective at reducing cache misses and can optimize some nests for which optimization techniques based on loop transformations alone are not successful. An important special case is one in which data layouts of some arrays are fixed and cannot be changed. We show how our algorithm can accommodate this case and demonstrate how it can be used to optimize multiple loop nests. Experiments on several benchmarks show that the techniques presented in this paper result in substantial improvement in cache performance.
Keywords
cache storage; optimising compilers; cache locality; cache performance; data layout transformations; loop transformations; multiple loop nests; optimization; reducing cache misses; Cache memory; Delay; Hardware; High performance computing; Optimizing compilers; Program processors; Programming profession; Protocols; Scheduling; Software performance
fLanguage
English
Journal_Title
Computers, IEEE Transactions on
Publisher
IEEE
ISSN
0018-9340
Type
jour
DOI
10.1109/12.752657
Filename
752657