DocumentCode :
3356
Title :
ICCI: In-Cache Coherence Information
Author :
Garcia-Guirado, Antonio ; Fernandez-Pascual, Ricardo ; Garcia, Jose M.
Author_Institution :
Dept. de Ing. y Tecnol. de Comput., Univ. de Murcia, Murcia, Spain
Volume :
64
Issue :
4
fYear :
2015
fDate :
Apr-15
Firstpage :
995
Lastpage :
1014
Abstract :
In this paper we introduce ICCI, a new cache organization that leverages shared cache resources and flat coherence protocols to provide inexpensive hardware cache coherence for large core counts (e.g., 512), achieving execution times close to a nonscalable sparse directory while noticeably reducing the energy consumption of the memory system. Very simple changes in the system with respect to traditional bit-vector directories are enough to implement ICCI. Moreover, ICCI does not introduce any storage overhead with respect to a broadcast-based protocol, yet it provides large storage space for coherence information. ICCI makes smarter use of cache resources by dynamically allowing last-level cache entries to store blocks or sharing codes. This way, just the minimum number of directory entries required at runtime are allocated. Besides, ICCI suffers a negligible amount of directory-induced invalidations. Results for a 512-core CMP show that ICCI reduces the energy consumption of the memory system by up to 48 percent compared to a tag-embedded directory, up to 15 percent compared to a sparse directory, and up to 8 percent compared to the state-of-the-art Scalable Coherence Directory, which ICCI also outperforms in execution time. In addition, ICCI can be used in combination with elaborated sharing codes to apply it to extremely large core counts. We also show analytically that ICCI's dynamic allocation of entries makes it a suitable candidate to store coherence information efficiently for very large core counts (e.g., over 200K cores), based on the observation that data sharing makes fewer directory entries necessary per core as core count increases.
Keywords :
cache storage; energy consumption; multiprocessing systems; power aware computing; protocols; 512-core CMP; ICCI; bit-vector directories; broadcast-based protocol; directory-induced invalidations; energy consumption; flat coherence protocols; hardware cache coherence; in-cache coherence information; last-level cache entry; memory system; nonscalable sparse directory; scalable coherence directory; shared cache resources; tag-embedded directory; Benchmark testing; Coherence; Organizations; Proposals; Protocols; Scalability; Unicast; Cache coherence; cache organization; energy-efficiency; multi-core; scalability
fLanguage :
English
Journal_Title :
Computers, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/TC.2014.2308185
Filename :
6747961