A Prediction Based CMP Cache Migration Policy

Author

Hao, Song ; Du, Zhihui ; Bader, David ; Wang, Man

Author_Institution

Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing

fYear

2008

fDate

25-27 Sept. 2008

Firstpage

374

Lastpage

381

Abstract

The access latency of a large L2 cache, which is caused mainly by wire delay, is a critical obstacle to improving the performance of a CMP (Chip Multi-Processor) with NUCA (Non-Uniform Cache Architecture). This paper first provides a CMP L2 cache access performance model to analyze and evaluate L2 access efficiency. Based on this model, the total L2 cache access latency problem is formalized as an optimization problem and a lower bound on L2 cache access latency is given. A novel PBM (Prediction based L2 cache data Migration) algorithm is then designed; it employs sequential prediction to identify the data that will be accessed in the near future and migrates that data toward its users early, so that cores can perform their L2 accesses in nearby banks. The analysis shows that this active data migration algorithm exploits the principle of locality to reduce data access latency much more than the traditional lazy data migration policy. To validate the theoretical analysis, the HMTT toolkit is used to capture the complete memory trace of the SPEC 2000 benchmark running on an SMP computer. The memory trace shows that the prediction technique works well, and an L2 cache access simulator is developed to process the trace data. The simulation experiments show that the PBM policy achieves both a shorter block transfer distance and a lower average access latency: the average block transfer distance is reduced by up to 16.9%, and the average L2 access latency by up to 8.4%.
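
The contrast the abstract draws between lazy migration and prediction-based migration can be illustrated with a toy model. The sketch below is not the paper's PBM algorithm or its simulator. It assumes a single row of banks, measures latency as hop distance between a core and a bank, uses a simple "next block" rule as a stand-in for sequential prediction, and ignores the cost of the migrations themselves. All names (`step_toward`, `simulate`, `home`, `trace`) are hypothetical.

```python
from typing import Dict, List, Tuple

Access = Tuple[int, int]  # (core_position, block_id), both illustrative


def step_toward(pos: int, target: int) -> int:
    """Move one bank closer to target along a 1-D row of banks."""
    if pos < target:
        return pos + 1
    if pos > target:
        return pos - 1
    return pos


def simulate(trace: List[Access], home: Dict[int, int], predictive: bool) -> int:
    """Return the total access distance, in bank hops, for the trace."""
    loc = dict(home)  # current bank of each block; copied so home is reusable
    total = 0
    for core, block in trace:
        total += abs(loc[block] - core)
        # Lazy part: the accessed block migrates one hop toward its user.
        loc[block] = step_toward(loc[block], core)
        if predictive and block + 1 in loc:
            # Assumed sequential predictor: the next block is expected soon,
            # so it is migrated one hop toward the same core ahead of time.
            loc[block + 1] = step_toward(loc[block + 1], core)
    return total


# Four blocks start in bank 4; a core at position 0 reads them in order.
home = {0: 4, 1: 4, 2: 4, 3: 4}
trace = [(0, 0), (0, 1), (0, 2), (0, 3)]
lazy_hops = simulate(trace, home, predictive=False)
pbm_like_hops = simulate(trace, home, predictive=True)
print(lazy_hops, pbm_like_hops)
```

In this toy trace the lazy policy never benefits, because each block is touched only once, while the predictive variant shortens every access after the first. A real evaluation would also have to charge for the block transfers, which this sketch leaves out.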

Keywords

cache storage; memory architecture; microprocessor chips; multiprocessing systems; CMP; L2 cache access latency; NUCA; cache migration policy; chip multiprocessor; nonuniform cache architecture; performance evaluation; sequential prediction; Algorithm design and analysis; Computational modeling; Computer architecture; Computer science; Delay; High performance computing; Information science; Performance analysis; Predictive models; Wire; Chip Multi-Processor; L2-cache; Migration policy; Non-Uniform Cache Architecture

fLanguage

English

Publisher

IEEE

Conference_Titel

High Performance Computing and Communications, 2008. HPCC '08. 10th IEEE International Conference on

Conference_Location

Dalian

Print_ISBN

978-0-7695-3352-0

Type

conf

DOI

10.1109/HPCC.2008.83

Filename

4637721