DocumentCode :
969394
Title :
Exploiting locality for irregular scientific codes
Author :
Han, Hwansoo ; Tseng, Chau-Wen
Author_Institution :
Div. of Comput. Sci., KAIST, Daejeon
Volume :
17
Issue :
7
fYear :
2006
fDate :
7/1/2006 12:00:00 AM
Firstpage :
606
Lastpage :
618
Abstract :
Irregular scientific codes experience poor cache performance due to their irregular memory access patterns. In this paper, we present two new locality improving techniques for irregular scientific codes. Our techniques exploit geometric structures hidden in data access patterns and computation structures. Our new data reordering (GPART) finds the graph structure within data accesses and applies hierarchical clustering. Quality partitions are constructed quickly by clustering multiple neighbor nodes with priority on nodes with high degree and repeating a few passes. Overhead is kept low by clustering multiple nodes in each pass and considering only edges between partitions. Our new computation reordering (Z-SORT) treats the values of index arrays as coordinates and reorders corresponding computations in Z-curve order. Applied to dense inputs, Z-SORT achieves performance close to data reordering combined with other computation reordering but without the overhead involved in data reordering. Experiments on irregular scientific codes for a variety of meshes show locality optimization techniques are effective for both sequential and parallelized codes, improving performance by 60-87 percent. GPART achieved within 1-2 percent of the performance of more sophisticated partitioning algorithms, but with one third of the overhead. Z-SORT also yields the performance improvement of 64 percent for dense inputs, which is comparable with data reordering combined with computation reordering
Keywords :
cache storage; data structures; optimising compilers; Z-SORT computation reordering; Z-curve order; data access; data access patterns; data reordering; graph structure; hierarchical clustering; index arrays; irregular scientific codes; locality optimization techniques; Cache memory; Computational fluid dynamics; Computer architecture; Concurrent computing; Data mining; Data structures; Fluid dynamics; Microprocessors; Parallel processing; Partitioning algorithms; Compiler optimization; cache memories; computation reordering.; data reordering; inspector/executor;
fLanguage :
English
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
Publisher :
ieee
ISSN :
1045-9219
Type :
jour
DOI :
10.1109/TPDS.2006.88
Filename :
1642638
Link To Document :
بازگشت