DocumentCode
1865563
Title
Cache performance and algorithm optimization
Author
Xiangzhen, Qiao
Author_Institution
Inst. of Comput. Technol., Acad. Sinica, Beijing, China
fYear
1997
fDate
28 Apr-2 May 1997
Firstpage
12
Lastpage
17
Abstract
A technique to enhance the cache performance of some blocked algorithms is proposed. Drawing on results from number theory, the author presents a principle for array padding so that accesses to array subblocks do not generate conflict misses. The technique is applied to LU factorization and matrix multiplication. The principle is tested on a shared memory multiprocessor. The practical results agree with the theoretical analysis, and a performance increase of 20% to 150% is achieved.
Keywords
cache storage; matrix multiplication; number theory; optimisation; reduced instruction set computing; shared memory systems; subroutines; LU factorization; algorithm optimization; array padding; array subblock access; blocked algorithms; cache performance; performance; shared memory multiprocessor; Computers; Concurrent computing; Data structures; Delay; Optimization methods; Parallel algorithms; Performance analysis; System testing
fLanguage
English
Publisher
IEEE
Conference_Titel
High Performance Computing on the Information Superhighway, 1997. HPC Asia '97
Conference_Location
Seoul
Print_ISBN
0-8186-7901-8
Type
conf
DOI
10.1109/HPC.1997.592114
Filename
592114