Title :
Efficient and accurate analytical modeling of whole-program data cache behavior
Author :
Xue, Jingling ; Vera, Xavier
Author_Institution :
Sch. of Comput. Sci. & Eng., New South Wales Univ., Sydney, NSW, Australia
fDate :
5/1/2004
Abstract :
Data caches are a key hardware means to bridge the gap between processor and memory speeds, but only for programs that exhibit sufficient data locality in their memory accesses. Thus, a method for evaluating cache performance is required both to quantify cache misses and to guide data cache optimizations. Existing analytical models for data cache optimizations target mainly isolated perfect loop nests. We present an analytical model that is capable of statically analyzing not only loop nest fragments, but also complete numerical programs with regular and compile-time predictable memory accesses. Central to the whole-program approach are abstract call inlining, memory access vectors, and parametric reuse analysis, which allow the reuse and interference both within and across loop nests to be quantified precisely in a unified framework. Within this framework, the cache misses of a program are specified using mathematical formulas, and the miss ratio is predicted from these formulas using statistical sampling techniques. Our experimental results using kernels and whole programs indicate accurate cache miss estimates in substantially less time than simulation (typically several orders of magnitude faster).
Keywords :
cache storage; operating system kernels; optimisation; performance evaluation; program control structures; program processors; statistical analysis; abstract call inlining; analytical modelling techniques; cache misses; cache performance; compile-time; complete numerical programs; data cache behavior; data cache optimizations; data locality; kernels; loop nest fragments; memory access vectors; parametric reuse analysis; statistical sampling techniques; Analytical models; Bridges; Cache memory; Computational modeling; Hardware; Interference; Kernel; Optimization methods; Optimizing compilers; Sampling methods
Journal_Title :
Computers, IEEE Transactions on
DOI :
10.1109/TC.2004.1275296