DocumentCode
1920546
Title
An Accurate Prefetch Technique for Dynamic Paging Behaviour for Software Distributed Shared Memory
Author
Cai, Jie ; Strazdins, Peter E.
Author_Institution
Res. Sch. of Comput. Sci., Australian Nat. Univ., Canberra, ACT, Australia
fYear
2012
fDate
10-13 Sept. 2012
Firstpage
209
Lastpage
218
Abstract
Page-based software Distributed Shared Memory (sDSM) systems suffer from high memory consistency costs. An effective prefetch technique can reduce this overhead. However, accurate prediction is hard for applications that exhibit dynamic memory access and paging behavior. In this paper, we use Intel Cluster OpenMP (CLOMP) to study this problem. First, we present a stride-augmented run-length encoding (sRLE) method that reconstructs series of numbers into 2D rectangles, which facilitates more accurate analysis of paging behavior. Historical page miss records of OpenMP parallel and sequential regions are reconstructed and compressed by sRLE. Second, we design and implement a dynamic page prefetch technique (DReP) that uses these reconstructed records to predict and issue prefetches. DReP and its implementation are evaluated through simulations and experiments. The simulation results show that DReP significantly improves the efficiency (~34%) and coverage (~47%) of existing prefetch techniques. Moreover, the experimental results show that DReP reduces the memory consistency costs of CLOMP by 86% in an extreme false sharing scenario. With the assistance of sRLE, DReP reduces memory consistency costs for the LINPACK and NPB-OMP benchmarks by ~45% and ~38% on GigE and DDR IB networks, respectively. A detailed breakdown analysis shows that the software overhead introduced by DReP is negligible (~2%).
Keywords
distributed shared memory systems; runlength codes; storage management; CLOMP; DDR IB network; DReP; GigE network; Intel cluster OpenMP; LINPACK; NPB-OMP benchmark; OpenMP parallel; dynamic memory accessing; dynamic page prefetch technique; extreme false sharing; high memory consistency cost; page miss record; page-based software distributed shared memory; paging behavior; sDSM system; sRLE method; sequential region; software overhead; stride augmented run-length encoding; Arrays; Benchmark testing; Encoding; Equations; Prefetching; Synchronization; Dynamic Memory Pattern; Prefetch; Run-Length Encoding; Software DSM;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel Processing (ICPP), 2012 41st International Conference on
Conference_Location
Pittsburgh, PA
ISSN
0190-3918
Print_ISBN
978-1-4673-2508-0
Type
conf
DOI
10.1109/ICPP.2012.16
Filename
6337582