Title :
Effective hardware-based data prefetching for high-performance processors
Author :
Chen, Tien-Fu ; Baer, Jean-Loup
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Chung Cheng Univ., Chiayi, Taiwan
Date :
5/1/1995
Abstract :
Memory latency and bandwidth are progressing at a much slower pace than processor performance. In this paper, we describe and evaluate the performance of three variations of a hardware function unit whose goal is to assist a data cache in prefetching data accesses so that memory latency is hidden as often as possible. The basic idea of the prefetching scheme is to keep track of data access patterns in a reference prediction table (RPT) organized as an instruction cache. The three designs differ mostly in the timing of the prefetching. In the simplest scheme (basic), prefetches can be generated one iteration ahead of actual use. The lookahead variation takes advantage of a lookahead program counter that ideally stays one memory latency time ahead of the real program counter and that is used as the control mechanism to generate the prefetches. Finally, the correlated scheme uses a more sophisticated design to detect patterns across loop levels. These designs are evaluated by simulating the ten SPEC benchmarks on a cycle-by-cycle basis. The results show that 1) the three hardware prefetching schemes all yield significant reductions in the data access penalty when compared with regular caches, 2) the benefits are greater when the hardware assist augments small on-chip caches, and 3) the lookahead scheme is the preferred one in terms of cost-performance.
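Illustrative sketch (not part of the original record): the abstract describes the RPT only at a high level, so the Python below is a simplified reading of the "basic" idea rather than the authors' design. It assumes a direct-mapped table indexed by the program counter of each load/store, with one stride and one confidence bit per entry. The names RPTEntry, ReferencePredictionTable, and access, the table size, and the two-matches rule for issuing a prefetch are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RPTEntry:
    tag: int              # PC of the instruction that owns this entry
    prev_addr: int        # last data address that instruction accessed
    stride: int = 0       # difference between its last two addresses
    steady: bool = False  # True when the last two strides were equal


class ReferencePredictionTable:
    """Simplified, hypothetical stride table indexed by instruction address."""

    def __init__(self, num_entries: int = 64) -> None:
        self.num_entries = num_entries
        self.entries: List[Optional[RPTEntry]] = [None] * num_entries

    def access(self, pc: int, addr: int) -> Optional[int]:
        """Record one data access; return an address to prefetch, or None."""
        index = pc % self.num_entries          # direct-mapped, like a small I-cache
        entry = self.entries[index]
        if entry is None or entry.tag != pc:   # miss: allocate or replace the entry
            self.entries[index] = RPTEntry(tag=pc, prev_addr=addr)
            return None
        new_stride = addr - entry.prev_addr
        entry.steady = (new_stride == entry.stride)
        entry.stride = new_stride
        entry.prev_addr = addr
        if entry.steady and entry.stride != 0:
            return addr + entry.stride         # predicted address for the next iteration
        return None


if __name__ == "__main__":
    rpt = ReferencePredictionTable()
    # One load at a fixed PC walking an array with an 8-byte stride.
    for address in (1000, 1008, 1016, 1024):
        print(address, rpt.access(0x400, address))
```

In this sketch the first two accesses only train the entry; from the third access onward the table proposes the address one stride ahead, which corresponds to prefetching one iteration ahead of actual use. The lookahead and correlated variations mentioned in the abstract are not modeled here.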
Keywords :
cache storage; fault tolerant computing; performance evaluation; SPEC benchmarks; data access patterns; data cache; hardware function unit; hardware prefetching schemes; hardware-based data prefetching; high-performance processors; instruction cache; lookahead program counter; lookahead scheme; memory latency; prefetching data accesses; reference prediction table; Bandwidth; Bridges; Coherence; Computer science; Counting circuits; Delay; Hardware; Predictive models; Prefetching; Timing
Journal_Title :
Computers, IEEE Transactions on