Title :
Load-balancing branch target cache and prefetch buffer
Author :
Chi, Chi-hung ; Yuan, Jun-Li
Author_Institution :
Sch. of Comput., Nat. Univ. of Singapore, Singapore
Abstract :
Sophisticated branch prediction and compiler optimization technologies result in higher predictability of instruction references, thus making the branch target cache and prefetch buffer (BTC+PB) design appealing. Surprisingly, however, this BTC+PB design actually performs worse than a non-partitioned instruction cache. Further investigation shows that this degradation is mainly due to the limited bus bandwidth available for prefetching. To remedy this, we propose two load-balancing mechanisms for the BTC+PB design: the multi-blocks target (MBT) and dynamic prefetched instruction placement (DIP) techniques. The basic idea of both techniques is to trade cache space for bus bandwidth once the bus is found to be overloaded by prefetching. The resulting cache, called the LB+PB design, is found to outperform current non-partitioned instruction cache designs. On the SPEC95 benchmarks, the memory latency due to instruction references is reduced by 5% to 15% on average, and by more than 50% for some benchmarks.
Keywords :
buffer storage; cache storage; performance evaluation; resource allocation; SPEC95; benchmarks; branch prediction; branch target cache; compiler optimization; dynamic prefetched instruction placement technique; instruction references; limited bus bandwidth; load balancing; memory latency; multi-blocks target technique; prefetch buffer; Bandwidth; Computer aided instruction; Degradation; Delay; Electronics packaging; Instruction sets; Optimizing compilers; Performance analysis; Prefetching;
Conference_Title :
Computer Design, 1999. (ICCD '99) International Conference on
Conference_Location :
Austin, TX
Print_ISBN :
0-7695-0406-X
DOI :
10.1109/ICCD.1999.808578