Regional Information Center for Science and Technology - Strategies for mapping dataflow blocks to distributed hardware

DocumentCode :

2579046

Title :

Strategies for mapping dataflow blocks to distributed hardware

Author :

Robatmili, Behnam ; Coons, Katherine E. ; Burger, Doug ; McKinley, Kathryn S.

Author_Institution :

Dept. of Comput. Sci., Univ. of Texas at Austin, Austin, TX

fYear :

2008

fDate :

8-12 Nov. 2008

Firstpage :

Lastpage :

Abstract :

Distributed processors must balance communication and concurrency. When dividing instructions among the processors, key factors are the available concurrency, criticality of dependence chains, and communication penalties. The amount of concurrency determines the importance of the other factors: if concurrency is high, wider distribution of instructions is likely to tolerate the increased operand routing latencies. If concurrency is low, mapping dependent instructions close to one another is likely to reduce communication costs that contribute to the critical path. This paper explores these tradeoffs for distributed Explicit Dataflow Graph Execution (EDGE) architectures that execute blocks of dataflow instructions atomically. A runtime block mapper assigns instructions from a single thread to distributed hardware resources (cores) based on compiler-assigned instruction identifiers. We explore two approaches: fixed strategies that map all blocks to the same number of cores, and adaptive strategies that vary the number of cores for each block. The results show that the best fixed strategy varies based on the cores' issue width. A simple adaptive strategy improves performance over the best fixed strategies for single and dual-issue cores, but its benefits decrease as the cores' issue width increases. These results show that by choosing an appropriate runtime block mapping strategy, average performance can be increased by 18%, while simultaneously reducing average operand communication by 70%, saving energy as well as improving performance. These results indicate that runtime block mapping is a promising mechanism for balancing communication and concurrency in distributed processors.
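
The difference between the fixed and adaptive strategies named in the abstract can be illustrated with a toy model. This is only a sketch under stated assumptions, not the mapper described in the paper: the modulo placement rule, the per-block concurrency estimate, and the names map_block_fixed and choose_cores_adaptive are all hypothetical and are introduced here purely for illustration.

```python
import math
from typing import Dict, List


def map_block_fixed(instruction_ids: List[int], num_cores: int) -> Dict[int, List[int]]:
    """Place each instruction of a block on one of num_cores cores.

    Assumption: the core is chosen as (identifier mod num_cores). The record
    only says that placement is based on compiler-assigned identifiers; the
    actual rule is not given there.
    """
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    placement: Dict[int, List[int]] = {core: [] for core in range(num_cores)}
    for inst_id in instruction_ids:
        placement[inst_id % num_cores].append(inst_id)
    return placement


def choose_cores_adaptive(concurrency: int, issue_width: int, max_cores: int) -> int:
    """Pick a core count for one block (hypothetical heuristic).

    Uses just enough cores to issue `concurrency` independent instructions
    per cycle, so low-concurrency blocks stay on few cores (less operand
    routing) and high-concurrency blocks spread out.
    """
    if issue_width < 1 or max_cores < 1:
        raise ValueError("issue_width and max_cores must be at least 1")
    needed = math.ceil(concurrency / issue_width)
    return max(1, min(needed, max_cores))


if __name__ == "__main__":
    block = list(range(8))  # identifiers 0..7 for one block
    # Fixed strategy: every block uses the same number of cores.
    print(map_block_fixed(block, 4))
    # Adaptive strategy: the core count depends on the block's concurrency.
    cores = choose_cores_adaptive(concurrency=2, issue_width=2, max_cores=16)
    print(map_block_fixed(block, cores))
```

In this toy model, wider issue per core lowers the core count the adaptive rule selects, which is in line with the abstract's observation that the benefit of adapting shrinks as issue width grows. The model does not reproduce any measured result.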

Keywords :

concurrency control; data flow graphs; program compilers; program diagnostics; resource allocation; adaptive strategy; compiler-assigned instruction identifier; concurrency balancing; dataflow block mapping; distributed explicit dataflow graph execution architecture; distributed hardware resource; distributed processor; fixed strategy; operand routing latency; runtime block mapper; Concurrent computing; Costs; Delay; Distributed computing; Hardware; Instruction sets; Intrusion detection; Parallel processing; Routing; Runtime;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Microarchitecture, 2008. MICRO-41. 2008 41st IEEE/ACM International Symposium on

Conference_Location :

Lake Como

ISSN :

1072-4451

Print_ISBN :

978-1-4244-2836-6

Electronic_ISBN :

1072-4451

Type :

conf

DOI :

10.1109/MICRO.2008.4771776

Filename :

4771776

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2579046