Regional Information Center for Science and Technology - A Locality-based Performance Model for Load-and-Compute Style Computation

DocumentCode :

1925827

Title :

A Locality-based Performance Model for Load-and-Compute Style Computation

Author :

Yuan, Liang ; Zhang, Yunquan

Author_Institution :

Lab. of Parallel Software & Comput. Sci., Inst. of Software, Beijing, China

Year :

2012

Date :

24-28 Sept. 2012

Firstpage :

566

Lastpage :

571

Abstract :

The growing speed gap between processor and memory is often the critical bottleneck in achieving high performance. Hardware caches, programming models, algorithms, and data structures have been proposed to exploit locality and reduce memory overhead. Some of these designs share a common load-and-compute style, in which the algorithm first moves all needed data into the cache and then operates only on the data already there. In this paper, we introduce a locality function to model the reuse ability of an algorithm and propose a corresponding performance model. We then analyze theoretically how to use and design caches under our model: (1) We present theorems that give the optimal cache partition scheme for the software buffering technique, which aims to hide memory overhead. (2) We provide methods for deciding the optimal multicore design that maximizes the benefits of both shared and private caches. (3) We incorporate memory overhead into Amdahl's Law to study the speedup limit imposed by memory bandwidth.

Keywords :

cache storage; data structures; multiprocessing systems; Amdahl Law; algorithm reuse ability; data movement; data structure; hardware cache; load-and-compute style computation; locality function; locality-based performance model; memory bandwidth; memory overhead hiding; memory overhead reduction; memory speed; optimal cache partition scheme; optimal multicore design; performance bottleneck; private cache; processor speed; programming algorithm; programming model; shared cache; software buffering technique; speedup limitation; Bandwidth; Computational modeling; Equations; Load modeling; Mathematical model; Multicore processing; System-on-a-chip; cache partition

Language :

English

Publisher :

IEEE

Conference_Title :

Cluster Computing (CLUSTER), 2012 IEEE International Conference on

Conference_Location :

Beijing

Print_ISBN :

978-1-4673-2422-9

Type :

conf

DOI :

10.1109/CLUSTER.2012.25

Filename :

6337824

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1925827