Title :
Cache-oblivious algorithms
Author :
Frigo, Matteo ; Leiserson, Charles E. ; Prokop, Harald ; Ramachandran, Sridhar
Author_Institution :
Lab. for Comput. Sci., MIT, Cambridge, MA, USA
Abstract :
This paper presents asymptotically optimal algorithms for rectangular matrix transpose, FFT, and sorting on computers with multiple levels of caching. Unlike previous optimal algorithms, these algorithms are cache oblivious: no variables dependent on hardware parameters, such as cache size and cache-line length, need to be tuned to achieve optimality. Nevertheless, these algorithms use an optimal amount of work and move data optimally among multiple levels of cache. For a cache with size Z and cache-line length L where Z=Ω(L2 ) the number of cache misses for an m×n matrix transpose is Θ(1+mn/L). The number of cache misses for either an n-point FFT or the sorting of n numbers is Θ(1+(n/L)(1+logZn)). We also give an Θ(mnp)-work algorithm to multiply an m×n matrix by an n×p matrix that incurs Θ(1+(mn+np+mp)/L+mnp/L√Z) cache faults. We introduce an “ideal-cache” model to analyze our algorithms. We prove that an optimal cache-oblivious algorithm designed for two levels of memory is also optimal for multiple levels and that the assumption of optimal replacement in the ideal-cache model. Can be simulated efficiently by LRU replacement. We also provide preliminary empirical results on the effectiveness of cache-oblivious algorithms in practice
Keywords :
computational complexity; fast Fourier transforms; matrix algebra; sorting; FFT; LRU replacement; asymptotically optimal algorithms; cache faults; cache misses; cache obliviousness; cache-oblivious algorithms; ideal-cache; rectangular matrix transpose; resource-oblivious algorithms; sorting; Algorithm design and analysis; Banking; Central Processing Unit; Hardware; Laboratories; Sorting; Strontium;
Conference_Titel :
Foundations of Computer Science, 1999. 40th Annual Symposium on
Conference_Location :
New York City, NY
Print_ISBN :
0-7695-0409-4
DOI :
10.1109/SFFCS.1999.814600