Abstract:
In this paper, three new systolic arrays for matrix multiplication are proposed. The first array uses n^2 processing elements (PEs) and completes a matrix multiplication in 3n-2 clock cycles, the minimum among the known structures. This is achieved by applying a new input data flow and deposition scheme. The second array is derived by combining this data flow technique with Blahut's simple matrix multiplication algorithm. Not only does the second array match the least processing time of 3n-2 clock cycles, it also has the least area complexity, about n^2/2 PEs. By further modifying the input data flow patterns of the second array, the third array is obtained; its processing time is further reduced to 2.5n-2 clock cycles. The proposed architectures perform better than the known structures according to several standard performance measures.
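To make the quoted figures concrete, the following minimal sketch tabulates the cycle and PE counts stated above for a given n (taken here to be the matrix dimension). It rests on several assumptions: the function names are hypothetical, the third array is assumed to keep the second array's PE count of about n^2/2 (the abstract does not state it), "about n^2/2" is treated as exactly n^2/2, and the area-time products AT and AT^2 are used only as examples of common VLSI measures, since the abstract does not name the measures it relies on.

```python
from fractions import Fraction


def abstract_figures(n):
    """Return {array name: (clock cycles, PE count)} from the abstract's figures.

    Assumption: the third array is given the same PE count as the second,
    and "about n^2/2" is treated as exactly n^2/2.
    """
    n = Fraction(n)
    return {
        "first": (3 * n - 2, n * n),
        "second": (3 * n - 2, n * n / 2),
        "third": (Fraction(5, 2) * n - 2, n * n / 2),  # PE count assumed
    }


def area_time(n):
    """Return {array name: (A*T, A*T^2)}, with A = PE count and T = cycles.

    Assumption: AT and AT^2 stand in for the unnamed "standard measures".
    """
    return {
        name: (pes * cycles, pes * cycles * cycles)
        for name, (cycles, pes) in abstract_figures(n).items()
    }


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(f"n = {n}")
        figures = abstract_figures(n)
        products = area_time(n)
        for name in ("first", "second", "third"):
            cycles, pes = figures[name]
            at, at2 = products[name]
            print(f"  {name:6s} T={float(cycles):6.1f}  A={float(pes):6.1f}  "
                  f"AT={float(at):9.1f}  AT^2={float(at2):11.1f}")
```

Exact rational arithmetic is used so that the 2.5n term and the n^2/2 term introduce no rounding when n is odd; the values are converted to floats only for printing.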