Regional Information Center for Science and Technology - On-chip cache hierarchy-aware tile scheduling for multicore machines

DocumentCode :

3105833

Title :

On-chip cache hierarchy-aware tile scheduling for multicore machines

Author :

Liu, Jun ; Zhang, Yuanrui ; Ding, Wei ; Kandemir, Mahmut

Author_Institution :

Dept. of Comput. Sci. & Eng., Pennsylvania State Univ., University Park, PA, USA

fYear :

2011

fDate :

2-6 April 2011

Firstpage :

161

Lastpage :

170

Abstract :

Iteration space tiling and scheduling is an important technique for optimizing loops that constitute a large fraction of execution times in computation kernels of both scientific codes and embedded applications. While tiling has been studied extensively in the context of both uniprocessor and multiprocessor platforms, prior research has paid less attention to tile scheduling, especially when targeting multicore machines with deep on-chip cache hierarchies. In this paper, we propose a cache hierarchy-aware tile scheduling algorithm for multicore machines, with the purpose of maximizing both horizontal and vertical data reuses in on-chip caches, and balancing the workloads across different cores. This scheduling algorithm is one of the key components in a source-to-source translation tool that we developed for automatic loop parallelization and multithreaded code generation from sequential codes. To the best of our knowledge, this is the first effort that develops a fully-automated tile scheduling strategy customized for on-chip cache topologies of multicore machines. The experimental results collected by executing twelve application programs on three commercial Intel machines (Nehalem, Dunnington, and Harpertown) reveal that our cache-aware tile scheduling yields a 27.9% reduction in cache misses and, on average, a 13.5% improvement in execution times over an alternative method tested.

Keywords :

cache storage; embedded systems; multi-threading; multiprocessing systems; processor scheduling; program control structures; resource allocation; automatic loop parallelization; cache hierarchy-aware tile scheduling algorithm; computation kernels; deep on-chip cache hierarchies; embedded applications; horizontal data reuses; iteration space tiling; multicore machines; multiprocessor platforms; multithreaded code generation; on-chip cache hierarchy-aware tile scheduling; on-chip caches; optimizing loops; scientific codes; sequential codes; source-to-source translation tool; uniprocessor platforms; vertical data reuses; workload balancing; Multicore processing; Optimization; Schedules; Scheduling algorithm; Shape; System-on-a-chip; Tiles;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Code Generation and Optimization (CGO), 2011 9th Annual IEEE/ACM International Symposium on

Conference_Location :

Chamonix

Print_ISBN :

978-1-61284-356-8

Electronic_ISBN :

978-1-61284-358-2

Type :

conf

DOI :

10.1109/CGO.2011.5764684

Filename :

5764684

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3105833