A 6 mW, 5,000-Word Real-Time Speech Recognizer Using WFST Models

Author

Price, Michael ; Glass, James ; Chandrakasan, Anantha P.

Author_Institution

Electr. Eng. & Comput. Sci. Dept., Massachusetts Inst. of Technol., Cambridge, MA, USA

Volume

50

Issue

1

fYear

2015

fDate

Jan. 2015

Firstpage

102

Lastpage

112

Abstract

We describe an IC that provides a local speech recognition capability for a variety of electronic devices. We start with a generic speech decoder architecture that is programmable with industry-standard WFST and GMM speech models. Algorithmic and architectural enhancements are incorporated in order to achieve real-time performance amid system-level constraints on internal memory size and external memory bandwidth. A 2.5 × 2.5 mm test chip implementing this architecture was fabricated using a 65 nm process. The chip performs a 5,000-word recognition task in real time with 13.0% word error rate, 6.0 mW core power consumption, and a search efficiency of approximately 16 nJ per hypothesis.

Keywords

Gaussian processes; speech recognition; GMM speech models; WFST models; external memory bandwidth; generic speech decoder architecture; internal memory size; local speech recognition capability; real-time speech recognizer; Bandwidth; Decoding; Hidden Markov models; Random access memory; Real-time systems; Speech; CMOS digital integrated circuits; Gaussian mixture models (GMM); low-power electronics; weighted finite-state transducers (WFST)

fLanguage

English

Journal_Title

Solid-State Circuits, IEEE Journal of

Publisher

IEEE

ISSN

0018-9200

Type

jour

DOI

10.1109/JSSC.2014.2367818

Filename

6975250