Title :
Run-time reference clustering for cache performance optimization
Author :
Kaplow, Wesley K. ; Szymanski, Boleslaw K. ; Tannenbaum, Peter ; Decyk, Viktor K.
Author_Institution :
Dept. of Comput. Sci., Rensselaer Polytech. Inst., Troy, NY, USA
Abstract :
We introduce a method for improving the cache performance of irregular computations in which data are referenced through indirection arrays defined at run time. Such computations often arise in scientific problems. The presented method, called Run-Time Reference Clustering (RTRC), is a run-time analog of the compile-time blocking used for dense matrix problems. RTRC uses the data partitioning and remapping techniques that are part of distributed-memory multiprocessor codes designed to minimize interprocessor communication. Remapping each set of local data decreases cache misses in the same way that remapping the global data decreases off-processor references. We demonstrate the applicability and performance of the RTRC technique on several prevalent applications: Sparse Matrix-Vector Multiply, Particle-in-Cell, and CHARMM-like codes. Performance results on SPARC-20, SP-2, and T3-D processors show that single-node execution performance can be improved by as much as 35%.
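The general idea of clustering indirect references can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`spmv_coo`, `cluster_order`, `apply_order`), the coordinate-format storage, and the simple sort by row and column block are all assumptions made for illustration, standing in for the partitioning and remapping techniques the abstract refers to.

```python
def spmv_coo(rows, cols, vals, x, n_rows):
    """Compute y = A @ x for A stored in coordinate form.

    rows and cols are the run-time indirection arrays: every
    access to x and y goes through them.
    """
    y = [0.0] * n_rows
    for r, c, v in zip(rows, cols, vals):
        y[r] += v * x[c]
    return y


def cluster_order(rows, cols, block):
    """Hypothetical clustering step (an assumption, not the paper's method):
    order the nonzeros so that entries touching the same block of y and
    the same block of x are visited consecutively."""
    return sorted(range(len(rows)),
                  key=lambda k: (rows[k] // block, cols[k] // block))


def apply_order(order, *arrays):
    """Permute each array according to the visiting order."""
    return [[a[k] for k in order] for a in arrays]


# Small 4x4 example with scattered references.
rows = [0, 3, 1, 2, 0, 3]
cols = [3, 0, 1, 2, 0, 3]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.0, 2.0, 3.0, 4.0]

order = cluster_order(rows, cols, block=2)
c_rows, c_cols, c_vals = apply_order(order, rows, cols, vals)

y_ref = spmv_coo(rows, cols, vals, x, 4)        # original visiting order
y_clu = spmv_coo(c_rows, c_cols, c_vals, x, 4)  # clustered visiting order
```

In this sketch the reordering changes only the order in which nonzeros are visited, so the product is unchanged, while consecutive iterations touch nearby elements of `x` and `y`. The block size would in practice be chosen relative to the cache size, which is left as an assumption here.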
Keywords :
cache storage; distributed memory systems; fault tolerant computing; parallel programming; storage management; CHARMM like codes; Particle-in-Cell; RTRC; SPARC-20; Sparse Matrix-Vector Multiply; cache misses; cache performance optimization; compile time blocking; data partitioning; data referencing; dense matrix problems; distributed memory multiprocessor codes; global data remapping; interprocessor communication minimisation; irregular computations; local data remapping; off-processor references; remapping techniques; run time analog; run time defined indirection arrays; run time reference clustering; single node execution performance; Computer science; Laboratories; Optimization; Physics; Prefetching; Propulsion; Runtime; Simulated annealing; Sparse matrices; USA Councils;
Conference_Title :
Parallel Algorithms/Architecture Synthesis, 1997. Proceedings., Second Aizu International Symposium
Conference_Location :
Aizu-Wakamatsu
Print_ISBN :
0-8186-7870-4
DOI :
10.1109/AISPAS.1997.581623