DocumentCode :
1759238
Title :
Architecture Support for Tightly-Coupled Multi-Core Clusters with Shared-Memory HW Accelerators
Author :
Dehyadegari, Masoud ; Marongiu, Andrea ; Kakoee, Mohammad Reza ; Mohammadi, Siamak ; Yazdani, Naser ; Benini, Luca
Author_Institution :
Sch. of Electr. & Comput. Eng., Univ. of Tehran, Tehran, Iran
Volume :
64
Issue :
8
fYear :
2015
fDate :
Aug. 1 2015
Firstpage :
2132
Lastpage :
2144
Abstract :
Coupling processors with acceleration hardware is an effective way to improve the energy efficiency of embedded systems. Many-core is now a dominant design paradigm for SoCs, which opens new challenges and opportunities for designing HW blocks. Exploring acceleration solutions that fit naturally into well-established parallel programming models and that can be added incrementally on top of existing parallel applications is thus extremely important. In this paper we focus on tightly-coupled multi-core cluster architectures, representative of the basic building block of the most recent many-cores, and we enhance them with dedicated HW processing units (HWPUs). We propose an architecture where the HWPUs share the same L1 data memory through which the processors also communicate, implementing a zero-copy communication model. High-level synthesis (HLS) tools are used to generate the HW blocks, and a custom wrapper then interfaces them to the tightly-coupled cluster. We validate our proposal on RTL models, running both synthetic workloads and real applications. Experimental results demonstrate that on average our solution provides nearly identical performance to traditional private-memory coarse-grained accelerators, but achieves up to 32 percent better performance/area/watt and requires only minimal modifications to legacy parallel codes.
Keywords :
embedded systems; high level synthesis; parallel programming; shared memory systems; system-on-chip; HLS; HW blocks; HW processing units; HWPU; L1 data memory; RTL models; SoC; acceleration hardware; architecture support; dominating design paradigm; energy efficiency; high-level synthesis tools; legacy parallel codes; parallel applications; private-memory coarse-grained accelerators; processor coupling; shared-memory HW accelerators; tightly-coupled multicore clusters; well-established parallel programming models; zero-copy communication model; Acceleration; Computer architecture; Data transfer; Hardware; Program processors; Registers; HW acceleration; many-core SoC; parallel architectures and software
fLanguage :
English
Journal_Title :
Computers, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/TC.2014.2360522
Filename :
6915684