DocumentCode
738354
Title
Instruction Cache Locking Using Temporal Reuse Profile
Author
Yun Liang ; Tulika Mitra ; Lei Ju
Author_Institution
Center for Energy-Efficient Comput. & Applic., Peking Univ., Beijing, China
Volume
34
Issue
9
fYear
2015
Firstpage
1387
Lastpage
1400
Abstract
The performance of most embedded systems is critically dependent on the average memory access latency. Improving the cache hit rate can have a significant positive impact on the performance of an application. Modern embedded processors often feature cache locking mechanisms that allow memory blocks to be locked in the cache under software control. Cache locking was primarily designed to offer timing predictability for hard real-time applications. Hence, prior techniques focus on employing cache locking to improve the worst-case execution time. However, cache locking can be quite effective in improving the average-case execution time of general embedded applications as well. In this paper, we explore static instruction cache locking to improve the average-case program performance. We introduce the temporal reuse profile (TRP) to accurately and efficiently model the cost and benefit of locking memory blocks in the cache. We consider two locking mechanisms, line locking and way locking. For each locking mechanism, we propose a branch-and-bound algorithm and a heuristic approach that use the TRP to determine the most beneficial memory blocks to be locked in the cache. Experimental results show that the heuristic approach achieves results close to those of the branch-and-bound algorithm and can improve performance by 12% on average for a 4 KB cache across a suite of real-world benchmarks. Moreover, our heuristic provides significant improvement over the state-of-the-art locking algorithm in terms of both performance and efficiency.
Keywords
cache storage; embedded systems; tree searching; Instruction Cache Locking; TRP; average-case program performance; branch-and-bound algorithm; embedded system; heuristic approach; memory access latency; modern embedded processor; software control; static instruction cache locking; temporal reuse profile; Algorithm design and analysis; Embedded systems; Heuristic algorithms; Optimization; Program processors; Real-time systems; Timing; Cache; Cache Locking; Performance; Temporal Reuse Profile; cache locking; performance; temporal reuse profile (TRP);
fLanguage
English
Journal_Title
Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on
Publisher
IEEE
ISSN
0278-0070
Type
jour
DOI
10.1109/TCAD.2015.2418320
Filename
7078946