DocumentCode :
2480825
Title :
Mamba: A scalable communication centric multi-threaded processor architecture
Author :
Chadwick, Gregory A. ; Moore, Simon W.
Author_Institution :
Comput. Lab., Univ. of Cambridge, Cambridge, UK
fYear :
2012
fDate :
Sept. 30 2012-Oct. 3 2012
Firstpage :
277
Lastpage :
283
Abstract :
In this paper we describe Mamba, an architecture designed for multi-core systems. Mamba has two major aims: (i) to make on-chip communication explicit to the programmer so they can optimize for it, and (ii) to support many threads and supply very lightweight communication and synchronization primitives for them. These aims are based on two observations: (i) as feature sizes shrink, on-chip communication becomes relatively more expensive than computation, and (ii) as systems become increasingly multi-core, we need highly scalable approaches to inter-thread communication and synchronization. We employ a network of processors where a given memory access will always go to the same cache, removing the need for a coherence protocol and allowing the program explicit control over all communication. A presence bit associated with each word provides a very lightweight, fine-grained synchronization primitive. We demonstrate an FPGA implementation with micro-benchmarks of standard spinlock and FIFO implementations and show that presence-bit-based implementations provide more efficient locking and lower-latency FIFO communication than a conventional shared-memory implementation, whilst also requiring fewer memory accesses. We also show that Mamba performance is insensitive to total thread count, allowing the use of as many threads as desired.
Keywords :
cache storage; field programmable gate arrays; multi-threading; multiprocessing systems; parallel memories; queueing theory; synchronisation; FIFO; FPGA; Mamba; bit based implementation; fine grained synchronization primitive; interthread communication; lightweight communication; memory access; multicore system; multithreaded processor architecture; on-chip communication; optimization; scalable communication; Benchmark testing; Computer architecture; Instruction sets; Message systems; Registers
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Design (ICCD), 2012 IEEE 30th International Conference on
Conference_Location :
Montreal, QC
ISSN :
1063-6404
Print_ISBN :
978-1-4673-3051-0
Type :
conf
DOI :
10.1109/ICCD.2012.6378652
Filename :
6378652