DocumentCode :
2540677
Title :
Design and performance of directory caches for scalable shared memory multiprocessors
Author :
Michael, Maged M. ; Nanda, Ashwini K.
Author_Institution :
IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
fYear :
1999
fDate :
9-13 Jan 1999
Firstpage :
142
Lastpage :
151
Abstract :
Recent research shows that the occupancy of the coherence controllers is a major performance bottleneck for distributed cache coherent shared memory multiprocessors. A significant part of the occupancy is due to the latency of accessing the directory which is usually kept in DRAM memory. Most coherence controller designs that use protocol processors for executing the coherence protocol handlers use the data cache of the protocol processor for caching directory entries along with protocol handler data. Analogously, a fast Directory Cache (DC) can also be used by the hardwired coherence controller designs to minimize directory access time. The paper studies the performance of directory caches using parallel applications from the SPLASH-2 suite. We demonstrate that using a directory cache can result in 40% or more improvement in the execution time of communication intensive applications. We also investigate the various directory cache design parameters: cache size, cache line size, and associativity. Experimental results show that the directory cache size requirements grow sub-linearly with the increase in the application's data set size. The results also show the performance advantage of multi-entry directory cache lines, as a result of spatial locality and the absence of sharing of directories. The impact of the associativity of the directory caches on performance is less than that of the size and the line size. We also find a linear relation between the directory cache miss ratio and the coherence controller occupancy, and between both measures and the execution time of the applications.
Keywords :
cache storage; parallel programming; protocols; shared memory systems; storage management; DRAM memory; SPLASH-2 suite; associativity; cache line size; cache size; coherence controller designs; coherence controller occupancy; coherence controllers; coherence protocol handlers; communication intensive applications; data cache; data set size; directory access time; directory cache design parameters; directory cache miss ratio; directory caches; directory entries; distributed cache coherent shared memory multiprocessors; execution time; hardwired coherence controller designs; line size; linear relation; multi-entry directory cache lines; parallel applications; performance bottleneck; protocol handler data; protocol processors; scalable shared memory multiprocessors; spatial locality; Access protocols; Delay; Ear; Electronic switching systems; Hardware; National electric code; Random access memory; Read only memory; Roads; Time measurement;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
High-Performance Computer Architecture, 1999. Proceedings. Fifth International Symposium On
Conference_Location :
Orlando, FL
Print_ISBN :
0-7695-0004-8
Type :
conf
DOI :
10.1109/HPCA.1999.744354
Filename :
744354