DocumentCode :
632853
Title :
Performance drawbacks for matrix multiplication using set associative cache in GPU devices
Author :
Djinevski, Leonid ; Arsenovski, Sime ; Ristov, Sasko ; Gusev, Marjan
Author_Institution :
FON Univ., Skopje, Macedonia
fYear :
2013
fDate :
20-24 May 2013
Firstpage :
193
Lastpage :
198
Abstract :
The performance of shared-memory processors shows negative performance impulses (drawbacks) in certain regions when executing the basic matrix multiplication algorithm. In this paper we continue our analysis of the GPU memory hierarchy and the corresponding cache memory organization. We give a theoretical analysis of why a negative performance impulse appears for specific problem sizes. The main reason is the cache storage organization: the negative performance peak is caused by the mapping of matrix elements onto a single cache set instead of the whole cache. The experimental results confirm our theoretical analysis. We also propose a method to avoid situations where performance drawbacks appear.
Keywords :
cache storage; content-addressable storage; graphics processing units; mathematics computing; matrix multiplication; memory architecture; performance evaluation; shared memory systems; GPU devices; GPU memory hierarchy; cache memory organization; cache storage organization; matrix element mapping; matrix multiplication algorithm; negative performance impulses; performance drawbacks; set associative cache; shared memory processor performance; Algorithm design and analysis; Cache memory; Computer architecture; Graphics processing units; Instruction sets; Organizations; Performance evaluation; Cache Memory; GPGPU; SIMD;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information & Communication Technology Electronics & Microelectronics (MIPRO), 2013 36th International Convention on
Conference_Location :
Opatija
Print_ISBN :
978-953-233-076-2
Type :
conf
Filename :
6596250