Data Allocation for Embedded Systems with Hybrid On-Chip Scratchpad and Caches

Author

Guanhua Wang ; Lei Ju ; Zhiping Jia ; Xin Li

Author_Institution

Sch. of Comput. Sci. & Technol., Shandong Univ., Jinan, China

fYear

2013

fDate

13-15 Nov. 2013

Firstpage

366

Lastpage

373

Abstract

The memory subsystem is the performance bottleneck for data-intensive applications, which makes it a key consideration in high-performance embedded system optimization. On-chip SRAMs, including scratchpad memories (SPMs) and caches, are widely used in embedded systems to narrow the speed gap between the CPU and memory. However, many existing SPM data allocation algorithms are designed for architectures whose on-chip SRAM consists purely of SPM. As a result, for off-the-shelf embedded processors with hybrid on-chip scratchpad and caches, these algorithms may not achieve optimal overall performance because they do not account for cache behavior. In this paper, we propose a comprehensive data allocation framework for such hybrid architectures. We formulate a cache-aware integer linear programming (ILP) model to identify memory objects to be allocated into SPM for average-case execution time improvement. The impact of SPM allocation on cache interference is captured to reduce the overall number of slow off-chip memory accesses. Experimental results show that, with our data allocation mechanism, such a hybrid on-chip SRAM organization outperforms pure cache or pure SPM architectures. We evaluate execution cycles on selected data-intensive benchmarks for different SPM-cache size combinations, achieving up to a 25.4% reduction in total execution cycles.
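As a rough illustration of what a cache-aware allocation decision looks like, the sketch below picks which memory objects to place in SPM so that an estimated off-chip access count is minimized under an SPM capacity limit. This is not the paper's ILP model: the cost model (per-object miss counts plus pairwise conflict misses between objects that stay in the cache), the function name `allocate_to_spm`, and all input values are assumptions made for illustration, and exhaustive search stands in for an ILP solver.

```python
from itertools import combinations


def allocate_to_spm(sizes, misses, conflicts, spm_capacity):
    """Choose the objects to place in SPM by exhaustive search.

    All inputs are hypothetical profiling data (assumptions, not from the paper):
      sizes[i]          size of object i in bytes
      misses[i]         estimated cache misses of object i if it stays cached
      conflicts[(i, j)] extra conflict misses if both i and j stay cached
      spm_capacity      SPM size in bytes
    Returns (set of object indices placed in SPM, estimated off-chip accesses).
    """
    n = len(sizes)
    best_cost, best_set = None, frozenset()
    for r in range(n + 1):
        for subset in combinations(range(n), r):
            # Capacity constraint: the chosen objects must fit in SPM.
            if sum(sizes[i] for i in subset) > spm_capacity:
                continue
            in_spm = set(subset)
            # Objects left in the cache still incur their own misses ...
            cost = sum(misses[i] for i in range(n) if i not in in_spm)
            # ... plus conflict misses with other cached objects.
            cost += sum(
                c for (i, j), c in conflicts.items()
                if i not in in_spm and j not in in_spm
            )
            if best_cost is None or cost < best_cost:
                best_cost, best_set = cost, frozenset(in_spm)
    return best_set, best_cost


# Toy data: object 2 has the most misses on its own, but objects 0 and 1
# conflict heavily with each other in the cache.
chosen, cost = allocate_to_spm(
    sizes=[4, 4, 8],
    misses=[10, 12, 30],
    conflicts={(0, 1): 50, (1, 2): 5},
    spm_capacity=8,
)
print(sorted(chosen), cost)  # [0, 1] 30
```

In this toy input, a cache-unaware choice that simply moves the object with the most misses (object 2) into SPM leaves the conflicting pair in the cache, for an estimated 72 off-chip accesses, whereas accounting for the conflict term selects objects 0 and 1, for an estimated 30. Exhaustive search is only practical for a handful of objects; a real formulation of this kind would be handed to an ILP solver.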

Keywords

SRAM chips; cache storage; embedded systems; integer programming; linear programming; CPU; ILP; SPM data allocation algorithms; SPM-cache size combinations; SRAM organization; cache-aware integer linear programming model; caches; data allocation; data allocation framework; data intensive applications; execution cycles; high-performance embedded system optimization; hybrid on-chip scratchpad; memory subsystem; off-chip memory accesses; off-the-shelf embedded processors; on-chip SRAM; performance bottleneck; scratchpad memories; total execution cycle reduction; Arrays; Embedded systems; Optimization; Program processors; Random access memory; Resource management; System-on-chip; Performance optimization; scratchpad memory

fLanguage

English

Publisher

IEEE

Conference_Titel

High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC), 2013 IEEE 10th International Conference on

Conference_Location

Zhangjiajie

Type

conf

DOI

10.1109/HPCC.and.EUC.2013.60

Filename

6831942