• DocumentCode
    5466
  • Title
    A Systolic Array Based GTD Processor With a Parallel Algorithm
  • Author
    Chia-Hsiang Yang; Chun-Wei Chou; Chia-Shen Hsu; Chiao-En Chen
  • Author_Institution
    Dept. of Electron. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan
  • Volume
    62
  • Issue
    4
  • fYear
    2015
  • fDate
    April 2015
  • Firstpage
    1099
  • Lastpage
    1108
  • Abstract
    Generalized triangular decomposition (GTD) has been found to be useful in the field of signal processing, but the feasibility of the related hardware has not yet been established. This paper presents (for the first time) a GTD processor architecture with a parallel algorithm. The proposed parallel GTD algorithm achieves an increase in speed of up to 1.66 times, compared to the speed of its conventional sequential counterpart for an 8 × 8 matrix. For hardware implementation, the proposed reconfigurable architecture is capable of computing singular value decomposition (SVD), geometric mean decomposition (GMD), and GTD for matrix sizes from 1 × 1 to 8 × 8. The proposed GTD processor is composed of 16 processing cores in a heterogeneous systolic array. Computations are distributed over area-efficient coordinate rotation digital computers (CORDICs) to achieve a high throughput. To establish the validity of the concept, a GTD processor was designed and implemented. The latency constraint of 16 μs specified in the 802.11ac standard is adopted for the hardware realization. The proposed design achieves a maximum throughput of 83.3k matrices/s for an 8 × 8 matrix at 112.4 MHz. The estimated power and core area are 172.7 mW and 1.96 mm², respectively, based on standard 90 nm CMOS technology.
  • Keywords
    CMOS integrated circuits; digital arithmetic; parallel algorithms; signal processing; singular value decomposition; systolic arrays; wireless LAN; 802.11ac standard; CMOS technology; CORDIC; GMD; GTD processor; SVD; complementary metal oxide semiconductor; coordinate rotation digital computer; frequency 112.4 MHz; generalized triangular decomposition; geometric mean decomposition; heterogeneous systolic array; latency constraint; matrix factorization; parallel algorithm; power 172.7 mW; processing core; signal processing; singular value decomposition; size 90 nm; Algorithm design and analysis; Arrays; Hardware; Matrix decomposition; Transceivers; Vectors; Generalized triangular decomposition (GTD); geometric mean decomposition (GMD); multiple-input multiple-output (MIMO); reconfigurable architecture;
  • fLanguage
    English
  • Journal_Title
    Circuits and Systems I: Regular Papers, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1549-8328
  • Type
    jour
  • DOI
    10.1109/TCSI.2015.2388831
  • Filename
    7070890