Title :
Early load: Hiding load latency in deep pipeline processor
Author :
Chang, Shun-Chieh ; Li, Walter Yuan-Hwa ; Kuo, Yuan-lung ; Chung, Chung-Ping
Author_Institution :
Dept. of Comput. Sci., Nat. Chiao Tung Univ., Hsinchu
Abstract :
Load instructions usually have long execution latency in a deep processor pipeline and have a significant impact on overall performance. Hiding load latency is therefore a serious problem in processor design. Memory load latency can be separated into two parts: cache-miss latency and load-to-use latency. Previous attempts to hide load latency in a deep processor pipeline have some limitations. In this paper, we propose a hardware-based method, called early load, that hides the load-to-use latency with little hardware overhead. The early load scheme allows a load instruction to load data from the cache system before it enters the execution stage. Meanwhile, a detection method verifies the correctness of the early operation before the load instruction enters the execution stage. Our experimental results show that our approach achieves an 11.64% performance improvement on the Dhrystone benchmark and 4.97% on average for the MiBench benchmark suite.
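To make the load-to-use idea concrete, the following is a minimal toy model in Python, not the design from the paper. It assumes a single-issue in-order pipeline, no cache misses, and a fixed load-to-use penalty. The names `Instr`, `count_cycles`, `load_to_use`, and `window` are hypothetical, and the "base register not written by the last `window` instructions" test is only an assumed stand-in for the detection method mentioned in the abstract.

```python
from typing import NamedTuple, Sequence, Tuple


class Instr(NamedTuple):
    op: str                 # "load" or "alu"
    dest: str               # destination register
    srcs: Tuple[str, ...]   # source registers (for a load: the base register)


def count_cycles(program: Sequence[Instr], load_to_use: int = 2,
                 early_load: bool = False, window: int = 2) -> int:
    """Count issue cycles for an in-order, single-issue toy pipeline."""
    cycles = 0
    ready_at = {}  # register -> cycle at which its value becomes usable
    for i, ins in enumerate(program):
        # In-order issue: wait for the previous instruction and all sources.
        issue = cycles
        for src in ins.srcs:
            issue = max(issue, ready_at.get(src, 0))

        latency = 1
        if ins.op == "load":
            latency = 1 + load_to_use
            if early_load:
                # Assumed detection rule: the early access is treated as
                # valid only if no recent instruction wrote the base register.
                recent = program[max(0, i - window):i]
                if all(p.dest not in ins.srcs for p in recent):
                    latency = 1  # load-to-use latency is hidden

        ready_at[ins.dest] = issue + latency
        cycles = issue + 1
    return cycles


# Base register r1 is written well before the load, so early load can apply.
hazard_free = [
    Instr("alu", "r1", ("r0",)),
    Instr("alu", "r5", ("r6",)),
    Instr("alu", "r7", ("r6",)),
    Instr("load", "r2", ("r1",)),
    Instr("alu", "r3", ("r2",)),
]

# Base register r1 is written immediately before the load, so it cannot.
hazard = [
    Instr("alu", "r1", ("r0",)),
    Instr("load", "r2", ("r1",)),
    Instr("alu", "r3", ("r2",)),
]

if __name__ == "__main__":
    for name, prog in (("hazard_free", hazard_free), ("hazard", hazard)):
        base = count_cycles(prog)
        early = count_cycles(prog, early_load=True)
        print(f"{name}: baseline={base} cycles, early_load={early} cycles")
```

In this sketch the consumer of the load stalls only when the early access is rejected, which is the effect the abstract describes. The cycle counts say nothing about the 11.64% and 4.97% figures reported above.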
Keywords :
cache storage; pipeline processing; Dhrystone benchmark; MiBench benchmark suite; cache system; cache-miss latency; deep pipeline processor; early load; hardware-based method; load instructions; load latency hiding; load-to-use latency; memory load latency; processor design; Clocks; Delay; Hardware; Jamming; Out of order; Pipelines; Process design; Processor scheduling; Registers; Throughput;
Conference_Title :
Computer Systems Architecture Conference, 2008. ACSAC 2008. 13th Asia-Pacific
Conference_Location :
Hsinchu
Print_ISBN :
978-1-4244-2682-9
Electronic_ISBN :
978-1-4244-2683-6
DOI :
10.1109/APCSAC.2008.4625440