Title :
The MIT Alewife machine: architecture and performance
Author :
Agarwal, Anant ; Bianchini, Ricardo ; Chaiken, David ; Johnson, Kirk L. ; Kranz, David ; Kubiatowicz, John ; Lim, Beng-Hong ; Mackenzie, Kenneth ; Yeung, Donald
Author_Institution :
Laboratory for Computer Science, MIT, Cambridge, MA, USA
Abstract :
Alewife is a multiprocessor architecture that supports up to 512 processing nodes connected over a scalable and cost-effective mesh network at a constant cost per node. The MIT Alewife machine, a prototype implementation of the architecture, demonstrates that a parallel system can be both scalable and programmable. Four mechanisms combine to achieve these goals: software-extended coherent shared memory provides a global, linear address space; integrated message passing allows compiler and operating system designers to provide efficient communication and synchronization; support for fine-grain computation allows many processors to cooperate on small problem sizes; and latency tolerance mechanisms, including block multithreading and prefetching, mask unavoidable delays due to communication. Microbenchmarks, together with over a dozen complete applications running on the 32-node prototype, help to analyze the behavior of the system. Analysis shows that integrating message passing with shared memory enables a cost-efficient solution to the cache coherence problem and provides a rich set of programming primitives. Block multithreading and prefetching improve performance by up to 25% individually and 35% together. Finally, language constructs that allow programmers to express fine-grain synchronization can improve performance by over a factor of two.
Keywords :
cache storage; message passing; operating systems (computers); parallel architectures; parallel machines; parallel programming; performance evaluation; program compilers; reconfigurable architectures; shared memory systems; synchronisation; MIT Alewife machine; block multithreading; communication; compiler designers; constant cost; cost-effective mesh network; fine-grain computation; global linear address space; integrated message passing; latency tolerance mechanisms; microbenchmarks; multiprocessor architecture; operating system designers; parallel system; performance; prefetching; processing nodes; programmable system; prototype implementation; scalable mesh network; software-extended coherent shared memory; synchronization; Computer architecture; Costs; Delay; Mesh networks; Message passing; Multithreading; Operating systems; Prefetching; Prototypes; Software prototyping;
Conference_Title :
Proceedings of the 22nd Annual International Symposium on Computer Architecture, 1995
Conference_Location :
Santa Margherita Ligure, Italy
Print_ISBN :
0-89791-698-0