DocumentCode :
3443941
Title :
Redundant linked list based cache coherence protocol
Author :
Li, Qiang ; Vlaovic, Stevan
Author_Institution :
Dept. of Comput. Eng., Santa Clara Univ., CA, USA
fYear :
1994
fDate :
12-14 Jun 1994
Firstpage :
43
Lastpage :
50
Abstract :
This article presents a distributed directory-based cache coherence protocol that improves performance and facilitates error recovery in large-scale multiprocessors. A number of distributed directory-based protocols, such as the Scalable Coherent Interface (SCI, ANSI/IEEE Std 1596), use a linked list structure to maintain cache coherence. While they work well for small to medium-sized systems, the list traversal overhead becomes high when the system grows to thousands of processors. Also, the system is vulnerable to a single node failure, in that recovery from such a failure involves all the processors in the system. Single node failures can happen relatively frequently when a protocol is applied to SCI-based Local Area MultiProcessors (LAMP), where individual nodes are autonomous computers that can power up and down individually. We propose an enhancement to the linked list approach. A redundant spanning list is constructed when the list is built, which achieves two goals: 1) the list traversal time is reduced from O(N) to O(√N), and 2) recovery from a single node failure is confined to the processors involved in the failed list, unless the head of the list is lost.
Keywords :
cache storage; data structures; distributed memory systems; fault tolerant computing; memory protocols; multiprocessing systems; system buses; system recovery; ANSI/IEEE Std 1596; SCI; SCI-based local area multiprocessors; Scalable Coherent Interface; autonomous computers; error recovery; large scale multiprocessors; performance; redundant linked list based cache coherence protocol; redundant spanning list; single node failure; Bandwidth; Computer errors; Delay; Distributed computing; Hardware; Lamps; Large-scale systems; Maintenance engineering; Protocols; Workstations;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fault-Tolerant Parallel and Distributed Systems, 1994., Proceedings of IEEE Workshop on
Conference_Location :
College Station, TX
Print_ISBN :
0-8186-6807-5
Type :
conf
DOI :
10.1109/FTPDS.1994.494473
Filename :
494473