DocumentCode :
2256710
Title :
Scalability investigation of Mat-Core processor
Author :
Soliman, Mostafa I. ; Al-Junaid, Abdulmajid F.
Author_Institution :
Electr. Eng. Dept., South Valley Univ., Aswan, Egypt
fYear :
2010
fDate :
19-22 Dec. 2010
Firstpage :
447
Lastpage :
450
Abstract :
Mat-Core is a research processor that aims to exploit the increasing number of transistors per IC to improve the performance of a wide range of applications. It extends a general-purpose scalar processor with a matrix unit for processing vector/matrix data. To hide memory latency, the matrix unit is decoupled into two components, address generation and data computation, which communicate through data queues. This paper investigates the scalability of the Mat-Core architecture with different numbers of parallel lanes (one, four, and eight) on several linear algebra kernels: scalar-vector multiplication, SAXPY, Givens rotation, rank-1 update, vector-matrix multiplication, and matrix-matrix multiplication. A cycle-accurate model of the Mat-Core processor is implemented in SystemC (a system-level modeling language). Four versions of the processor are implemented and evaluated to show its scalability: a single lane with 8-element vector registers, four lanes with 4 × 4 matrix registers, four lanes with 8 × 4 matrix registers, and eight lanes with 8 × 8 matrix registers. The first version exploits only the scalar and vector ISA, whereas the other versions can exploit all three levels of the Mat-Core ISA (scalar/vector/matrix). Our results show that increasing the number of parallel lanes from one to four and then to eight speeds up the execution of the six kernels by factors of 3.6x - 4.8x and 7.94x - 10.6x, respectively, which indicates the scalability of the Mat-Core architecture. Moreover, the maximum performance of the Mat-Core processor on matrix-matrix multiplication reaches 90% of the ideal value.
Keywords :
linear algebra; matrix multiplication; microprocessor chips; vectors; Givens rotation; SAXPY; SystemC; linear algebra kernel; mat-core processor; matrix-matrix multiplication; rank-1 update; scalability investigation; scalar processor; scalar-vector multiplication; system level modeling language; vector-matrix data; vector-matrix multiplication; Computer architecture; Computers; Kernel; Pipelines; Registers; Scalability; Vectors; high performance computing; performance evaluation; scalable architecture; vector/matrix processing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Microelectronics (ICM), 2010 International Conference on
Conference_Location :
Cairo
Print_ISBN :
978-1-61284-149-6
Type :
conf
DOI :
10.1109/ICM.2010.5696184
Filename :
5696184