DocumentCode
392409
Title
Performing speech recognition on multiple parallel files using continuous hidden Markov models on an FPGA
Author
Melnikoff, S.J.; Quigley, S.F.; Russell, M.J.
Author_Institution
Electron., Electr. & Comput. Eng., Univ. of Birmingham, UK
fYear
2002
fDate
16-18 Dec. 2002
Firstpage
399
Lastpage
402
Abstract
Speech recognition is a computationally demanding task, particularly the stages that use Viterbi decoding to convert pre-processed speech data into words or subword units, and the associated observation probability calculations, which employ multivariate Gaussian distributions. Any device that can reduce the load on, for example, a PC's processor is therefore advantageous. We present two implementations of a speech recognition system incorporating an FPGA, employing continuous hidden Markov models (HMMs) and capable of processing three speech files simultaneously. The first uses monophones and can perform recognition at 250 times real time (in terms of average time per observation), as well as outperforming its software equivalent. The second uses biphones and triphones, reducing the speedup to 13 times real time.
Keywords
Gaussian distribution; Viterbi decoding; field programmable gate arrays; hidden Markov models; probability; speech recognition; FPGA; biphones; continuous hidden Markov models; monophones; multiple parallel files; multivariate Gaussian distributions; observation probability calculations; pre-processed speech data; speech files; triphones; Decoding; Distributed computing; Software performance; Speech processing; Viterbi algorithm
fLanguage
English
Publisher
IEEE
Conference_Titel
Field-Programmable Technology, 2002. (FPT). Proceedings. 2002 IEEE International Conference on
Print_ISBN
0-7803-7574-2
Type
conf
DOI
10.1109/FPT.2002.1188720
Filename
1188720