Regional Information Center for Science and Technology (RICeST) - Memory Latency Reduction via Thread Throttling

DocumentCode :

2240932

Title :

Memory Latency Reduction via Thread Throttling

Author :

Cheng, Hsiang-Yun ; Lin, Chung-Hsiang ; Li, Jian ; Yang, Chia-Lin

Author_Institution :

Dept. of Comput. Sci. & Inf. Eng., Nat. Taiwan Univ., Taipei, Taiwan

fYear :

2010

fDate :

4-8 Dec. 2010

Firstpage :

Lastpage :

Abstract :

The memory wall is a well-known obstacle to processor performance improvement. The popularity of multi-core architectures will further exacerbate the problem, since memory resources are shared by all cores. Interference among requests from different cores may prolong the latency of memory accesses, thereby degrading system performance. To tackle the problem, this paper proposes decoupling application threads into compute and memory tasks and restricting the number of concurrent memory tasks to avoid interference among memory requests. Yet with this scheduling restriction, a CPU core may stay idle unnecessarily, which adversely affects overall performance. Therefore, we develop a memory thread throttling mechanism that dynamically tunes the number of allowable memory threads under workload variation to improve system performance. The proposed run-time mechanism monitors the memory and computation ratios of a program for phase detection. It then decides the memory thread constraint for the next program phase based on an analytical model that estimates system performance under different constraint values. To prove the concept, we prototype the mechanism in some real-world applications as well as synthetic workloads and evaluate their performance on real machines. The experimental results demonstrate up to 20% speedup with a pool of synthetic workloads on an Intel i7 (Nehalem) machine, matching the speedup estimated by the proposed analytical model. Furthermore, the intelligent run-time scheduling leads to a geometric mean of 12% performance improvement for real-world applications on the same hardware.
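
The core idea in the abstract, splitting each thread into a memory task and a compute task and capping how many memory tasks run at once, can be illustrated with a minimal sketch. This is not the paper's implementation: the names `MemoryTaskThrottle`, `worker`, and `run` are hypothetical, the limit is fixed rather than tuned at run time, and a list copy merely stands in for a memory-bound phase.

```python
import threading


class MemoryTaskThrottle:
    """Hypothetical helper: caps how many memory tasks may run concurrently."""

    def __init__(self, limit):
        self._sem = threading.Semaphore(limit)  # the memory thread constraint
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0  # highest number of memory tasks seen running together

    def run_memory_task(self, task):
        # Block until fewer than `limit` memory tasks are in flight.
        with self._sem:
            with self._lock:
                self._active += 1
                self.peak = max(self.peak, self._active)
            try:
                return task()
            finally:
                with self._lock:
                    self._active -= 1


def worker(throttle, data, out, index):
    # Memory task (throttled): copy the input, standing in for a memory-bound phase.
    chunk = throttle.run_memory_task(lambda: list(data))
    # Compute task (unrestricted): operate on the local copy.
    out[index] = sum(x * x for x in chunk)


def run(num_threads=8, limit=2):
    throttle = MemoryTaskThrottle(limit)
    data = range(1000)
    out = [0] * num_threads
    threads = [
        threading.Thread(target=worker, args=(throttle, data, out, i))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return throttle.peak, out


if __name__ == "__main__":
    peak, results = run()
    print("peak concurrent memory tasks:", peak)
```

A semaphore is used here only because it is the simplest way to express a fixed cap. The mechanism described in the abstract goes further: it adjusts the cap per program phase using an analytical model, which this record does not specify and the sketch does not attempt to reproduce.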

Keywords :

multi-threading; processor scheduling; shared memory systems; CPU core; Intel i7 machine; allowable memory threads; analytical model; concurrent memory tasks; decouple application threads; intelligent run-time scheduling; memory latency reduction; memory tasks; memory wall; multicore architecture; phase detection; processor performance improvement; scheduling restriction; thread throttling; workload variation;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Microarchitecture (MICRO), 2010 43rd Annual IEEE/ACM International Symposium on

Conference_Location :

Atlanta, GA

ISSN :

1072-4451

Print_ISBN :

978-1-4244-9071-4

Type :

conf

DOI :

10.1109/MICRO.2010.39

Filename :

5695525

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2240932