• DocumentCode
    738354
  • Title

    Instruction Cache Locking Using Temporal Reuse Profile

  • Author

    Yun Liang; Tulika Mitra; Lei Ju

  • Author_Institution
    Center for Energy-Efficient Computing and Applications, Peking University, Beijing, China
  • Volume
    34
  • Issue
    9
  • fYear
    2015
  • Firstpage
    1387
  • Lastpage
    1400
  • Abstract
    The performance of most embedded systems is critically dependent on the average memory access latency. Improving the cache hit rate can have a significant positive impact on the performance of an application. Modern embedded processors often feature cache locking mechanisms that allow memory blocks to be locked in the cache under software control. Cache locking was primarily designed to offer timing predictability for hard real-time applications. Hence, prior techniques focus on employing cache locking to improve the worst-case execution time. However, cache locking can be quite effective in improving the average-case execution time of general embedded applications as well. In this paper, we explore static instruction cache locking to improve the average-case program performance. We introduce the temporal reuse profile (TRP) to accurately and efficiently model the cost and benefit of locking memory blocks in the cache. We consider two locking mechanisms: line locking and way locking. For each locking mechanism, we propose a branch-and-bound algorithm and a heuristic approach that use the TRP to determine the most beneficial memory blocks to be locked in the cache. Experimental results show that the heuristic approach achieves results close to those of the branch-and-bound algorithm and can improve performance by 12% on average for a 4 KB cache across a suite of real-world benchmarks. Moreover, our heuristic provides significant improvement over the state-of-the-art locking algorithm in terms of both performance and efficiency.
  • Keywords
    cache storage; embedded systems; tree searching; Instruction Cache Locking; TRP; average-case program performance; branch-and-bound algorithm; embedded system; heuristic approach; memory access latency; modern embedded processor; software control; static instruction cache locking; temporal reuse profile; Algorithm design and analysis; Heuristic algorithms; Optimization; Program processors; Real-time systems; Timing; Cache; Cache Locking; Performance
  • fLanguage
    English
  • Journal_Title
    Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0278-0070
  • Type

    jour

  • DOI
    10.1109/TCAD.2015.2418320
  • Filename
    7078946