• DocumentCode
    1807009
  • Title

    A 40-nm 168-mW 2.4×-real-time VLSI processor for 60-kWord continuous speech recognition

  • Author

    He, Guangji ; Sugahara, Takanobu ; Izumi, Shintaro ; Kawaguchi, Hiroshi ; Yoshimoto, Masahiko

  • Author_Institution
    Kobe Univ., Kobe, Japan
  • fYear
    2012
  • fDate
    9-12 Sept. 2012
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    This paper describes a low-power VLSI chip for speaker-independent 60-kWord continuous speech recognition based on a context-dependent Hidden Markov Model (HMM). Our implementation includes multi-path Viterbi transition units and a compression-decoding scheme that reduces the external memory bandwidth for Gaussian Mixture Model (GMM) computation. We optimize the internal SRAM size by using the max-approximation GMM calculation and by adjusting the number of look-ahead frames. The test chip, fabricated in 40 nm CMOS technology, occupies 1.77 mm × 2.18 mm and contains 2.52 M transistors for logic and 4.29 Mbit of on-chip memory. Measured results show that, compared to the previous work, our implementation achieves a 34.2% reduction in required frequency (83.3 MHz) and a 48.5% reduction in power consumption (74.14 mW) for 60-kWord real-time continuous speech recognition. At 200 MHz and 1.1 V, the chip processes speech at up to 2.4× real-time with a power consumption of 168 mW.
  • Keywords
    CMOS memory circuits; Gaussian processes; SRAM chips; VLSI; Viterbi decoding; hidden Markov models; speaker recognition; speech coding; CMOS technology; GMM; Gaussian mixture model; VLSI processor; compression-decoding scheme; context-dependent hidden Markov model; frequency 200 MHz; frequency 83.3 MHz; internal SRAM size; look-ahead frames; low-power VLSI chip; max-approximation GMM calculation; multipath Viterbi transition units; on-chip memory; power 168 mW; power 74.14 mW; size 40 nm; speaker-independent continuous speech recognition; storage capacity 4.29 Mbit; transistors; voltage 1.1 V; Field programmable gate arrays; Hidden Markov models; Random access memory; Real-time systems; Speech recognition; Very large scale integration; Viterbi algorithm; 40 nm VLSI; hidden Markov model (HMM); large vocabulary continuous speech recognition (LVCSR)
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Custom Integrated Circuits Conference (CICC), 2012 IEEE
  • Conference_Location
    San Jose, CA
  • ISSN
    0886-5930
  • Print_ISBN
    978-1-4673-1555-5
  • Electronic_ISBN
    0886-5930
  • Type

    conf

  • DOI
    10.1109/CICC.2012.6330678
  • Filename
    6330678