DocumentCode :
2977737
Title :
Support for High-Frequency Streaming in CMPs
Author :
Rangan, Ram ; Vachharajani, Neil ; Stoler, Adam ; Ottoni, Guilherme ; August, David I. ; Cai, George Z N
Author_Institution :
Dept. of Comput. Sci. & Electr. Eng., Princeton Univ., NJ
fYear :
2006
fDate :
9-13 Dec. 2006
Firstpage :
259
Lastpage :
272
Abstract :
As the industry moves toward larger-scale chip multiprocessors, the need to parallelize applications grows. High inter-thread communication delays, exacerbated by over-stressed high-latency memory subsystems and ever-increasing wire delays, require parallelization techniques to create partially or fully independent threads to improve performance. Unfortunately, developers and compilers alike often fail to find sufficient independent work of this kind. Recently proposed pipelined streaming techniques have shown significant promise for both manual and automatic parallelization. These techniques have wide-scale applicability because they embrace inter-thread dependences (albeit acyclic dependences) and tolerate long-latency communication of these dependences. This paper addresses the lack of architectural support for this type of concurrency, which has blocked its adoption and hindered related language and compiler research. We observe that both manual and automatic techniques create high-frequency streaming threads, with communication occurring every 5 to 20 instructions. Even while easily tolerating inter-thread transit delays, high-frequency communication makes thread performance very sensitive to intra-thread delays from the repeated execution of the communication operations. Using this observation, we define the design-space and evaluate several mechanisms to find a better trade-off between performance and operating system, hardware, and design costs. From this, we find a light-weight streaming-aware enhancement to conventional memory subsystems that doubles the speed of these codes and is within 2% of the best-performing, but heavy-weight, hardware solution.
Keywords :
microprocessor chips; multi-threading; program compilers; shared memory systems; compiler research; high-frequency streaming threads; inter-thread communication delays; larger-scale chip multiprocessors; pipelined streaming; shared memory CMP; Application software; Costs; Delay; Hardware; Manuals; Operating systems; Pipeline processing; Program processors; Telecommunication traffic; Yarn;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Microarchitecture, 2006. MICRO-39. 39th Annual IEEE/ACM International Symposium on
Conference_Location :
Orlando, FL
ISSN :
1072-4451
Print_ISBN :
0-7695-2732-9
Type :
conf
DOI :
10.1109/MICRO.2006.47
Filename :
4041852