DocumentCode :
451136
Title :
Cache-Optimal Methods for Bit-Reversals
Author :
Zhang, Zhao ; Zhang, Xiaodong
Author_Institution :
College of William and Mary
fYear :
1999
fDate :
13-18 Nov. 1999
Firstpage :
26
Lastpage :
26
Abstract :
Bit-reversals are representative and important data reordering operations in many scientific computations. Performance degradation is mainly caused by cache conflict misses. Bit-reversals are often repeatedly used as fundamental subroutines for many scientific programs. Thus, in order to gain the best performance, cache-optimal methods and their implementations should be carefully and precisely done at the programming level. This type of performance programming for some special programs, such as the data reorderings, may significantly outperform an optimization from an automatic tool, such as a compiler. In this paper, we examine different methods using techniques of blocking, buffering, and padding for efficient implementations. We evaluate the merits and limits of each technique and their application and architecture-dependent conditions for developing cache-optimal methods. We present two contributions in this paper: (1) Our integrated blocking methods, which match cache associativity and TLB cache size and which fully use the available registers are cache-optimal and fast. (2) We show that our padding methods outperform other software oriented methods, and believe they are the fastest in terms of minimizing both CPU and memory access cycles. Since the padding methods are almost independent of hardware, they could be widely used on many uniprocessor workstations and SMP multiprocessors.
Keywords :
Algorithms; Application software; Automatic programming; Degradation; Hardware; Optimizing compilers; Performance gain; Program processors; Registers; Workstations;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Supercomputing, ACM/IEEE 1999 Conference
Print_ISBN :
1-58113-091-0
Type :
conf
DOI :
10.1109/SC.1999.10059
Filename :
1592669
Link To Document :
بازگشت