Title :
Non-strict cache coherence: exploiting data-race tolerance in emerging applications
Author :
Tambat, Siddhartha V. ; Vajapeyam, Sriram
Author_Institution :
Dept. of Comput. Sci. & Autom., Indian Inst. of Sci., Bangalore, India
Abstract :
Software distributed shared memory (DSM) platforms on networks of workstations tolerate large network latencies by employing one of several weak memory consistency models. Data-race tolerant applications, such as Genetic Algorithms (GAs), Probabilistic Inference, etc., offer an additional degree of freedom to tolerate network latency: they do not synchronize shared memory references, and behave correctly when supplied outdated shared data. However, these algorithms often have a high communication-to-computation ratio and can flood the network with messages in the presence of large message delays. We study the performance of controlled asynchronous implementations of these algorithms via the use of our previously proposed blocking Global Read memory access primitive. Global Read implements non-strict cache coherence by guaranteeing to return to the reader a shared datum value from within a specified staleness range. Experiments on an IBM SP2 multicomputer with an Ethernet show significant performance improvements for controlled asynchronous implementations. On a lightly loaded Ethernet network, most of the GA benchmarks see 30% to 40% improvement over the best competitor for 2 to 16 processors, while two of the Probabilistic Inference benchmarks see more than 80% improvement for 2 processors. As the network load increases, the benefits of non-strict cache coherence increase significantly.
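The abstract describes Global Read as a blocking read that returns a shared value from within a specified staleness range. The following is a minimal single-process sketch of that idea, not the authors' implementation. It assumes staleness is counted in iterations (for example, GA generations) and that update messages carry the writer's iteration number. The names StaleBoundedCache, deliver_update, global_read, reader_iter, and max_staleness are all hypothetical.

```python
import threading


class StaleBoundedCache:
    """Local cached copy of one shared datum, tagged with the writer's iteration.

    Assumption: staleness is measured as a difference in iteration numbers.
    """

    def __init__(self, value, version=0):
        self._value = value
        self._version = version
        self._cond = threading.Condition()

    def deliver_update(self, value, version):
        """Hypothetical hook, called when an update message from the writer arrives."""
        with self._cond:
            # Ignore messages that arrive out of order and are older than the cache.
            if version > self._version:
                self._value, self._version = value, version
                self._cond.notify_all()

    def global_read(self, reader_iter, max_staleness, timeout=None):
        """Block until the cached copy is at most max_staleness iterations old.

        Returns (value, version). Raises TimeoutError if no fresh enough
        copy arrives within `timeout` seconds.
        """
        oldest_ok = reader_iter - max_staleness
        with self._cond:
            fresh = self._cond.wait_for(
                lambda: self._version >= oldest_ok, timeout
            )
            if not fresh:
                raise TimeoutError("no sufficiently fresh copy arrived")
            return self._value, self._version
```

In this sketch, a reader at iteration 10 calling global_read(10, 2) returns immediately if its cached copy is from iteration 8 or later, and otherwise waits for an update. A larger max_staleness makes the reader block less often, while max_staleness of 0 forces it to wait for the writer's current iteration.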
Keywords :
cache storage; distributed shared memory systems; fault tolerant computing; hazards and race conditions; performance evaluation; workstation clusters; cache coherence; data-race tolerance; distributed shared memory; networks of workstations; performance; Application software; Automation; Communication system control; Computer science; Costs; Ethernet networks; Genetic algorithms; Intelligent networks; Propagation delay; Workstations;
Conference_Title :
Parallel Processing, 2000. Proceedings. 2000 International Conference on
Conference_Location :
Toronto, Ont.
Print_ISBN :
0-7695-0768-9
DOI :
10.1109/ICPP.2000.876082