DocumentCode :
2540684
Title :
Switch cache: a framework for improving the remote memory access latency of CC-NUMA multiprocessors
Author :
Iyer, Ravi ; Bhuyan, Laxmi Narayan
Author_Institution :
Dept. of Comput. Sci., Texas A&M Univ., College Station, TX, USA
fYear :
1999
fDate :
9-13 Jan 1999
Firstpage :
152
Lastpage :
160
Abstract :
Cache coherent non-uniform memory access (CC-NUMA) multiprocessors continue to suffer from remote memory access latencies due to comparatively slow memory technology and data transfer latencies in the interconnection network. We propose a novel hardware caching technique, called switch cache. The main idea is to implement small fast caches in crossbar switches of the interconnect medium to capture and store shared data as they flow from the memory module to the requesting processor. This stored data acts as a cache for subsequent requests, thus reducing the latency of remote memory accesses tremendously. The implementation of a cache in a crossbar switch needs to be efficient and robust, yet flexible for changes in the caching protocol. The design and implementation details of a CAche Embedded Switch ARchitecture, CAESAR, using wormhole routing with virtual channels are presented. Using detailed execution-driven simulations, we find that the CAESAR switch cache is capable of improving the performance of CC-NUMA multiprocessors by reducing the number of reads served at distant remote memories by up to 45% and improving the application execution time by as much as 20%. We conclude that the switch caches provide a cost-effective solution for designing high performance CC-NUMA multiprocessors.
Keywords :
cache storage; multiprocessor interconnection networks; network routing; storage management; CAESAR; CAche Embedded Switch ARchitecture; CC-NUMA multiprocessors; application execution time; cache coherent non-uniform memory access multiprocessors; caching protocol; cost-effective solution; crossbar switches; data transfer latencies; distant remote memories; execution-driven simulations; hardware caching technique; interconnection network; memory module; memory technology; remote memory access latencies; remote memory access latency; remote memory accesses; requesting processor; shared data; small fast caches; switch cache; virtual channels; wormhole routing; Access protocols; Computer science; Computer worms; Costs; Delay; Postal services; Random access memory; Read only memory; Switches;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
High-Performance Computer Architecture, 1999. Proceedings. Fifth International Symposium On
Conference_Location :
Orlando, FL
Print_ISBN :
0-7695-0004-8
Type :
conf
DOI :
10.1109/HPCA.1999.744357
Filename :
744357