Title :
Parameter sharing in subband likelihood-maximizing beamforming for speech recognition using microphone arrays
Author :
Seltzer, Michael L. ; Stern, Richard M.
Author_Institution :
Speech Technol. Group, Microsoft Res., Redmond, WA, USA
Abstract :
In this paper, we present methods to improve the computational efficiency of our previously proposed algorithm for microphone array processing for speech recognition, called subband likelihood-maximizing beamforming (S-LIMABEAM). In S-LIMABEAM, the parameters of a subband filter-and-sum beamformer are optimized to maximize the likelihood of the correct transcription of the utterance, as measured by the speech recognizer itself. This approach has been shown to produce significant improvements in recognition accuracy over conventional array processing methods in a variety of noisy and reverberant environments. However, because of the manner in which recognition features are computed, the number of subband parameters that have to be jointly optimized may be large, which slows the convergence of the algorithm. To address this problem, we present two methods of sharing parameters among multiple subband filters in order to significantly reduce the number of parameters to be optimized. Both of these methods exploit the spectral smoothing that occurs in the feature extraction process, but do so in different ways. By sharing parameters in the proposed manner, we are able to obtain a significant reduction in the time to convergence of S-LIMABEAM with a minimal degradation in speech recognition accuracy.
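The parameter-sharing idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single complex weight per microphone per subband (the paper's subband filters may have more taps), it ties weights across contiguous groups of subbands as one plausible sharing scheme (the abstract does not specify the two methods), and it omits the likelihood-based optimization entirely. The names expand_shared_weights, beamform_shared and group_of_subband are hypothetical.

```python
import numpy as np


def expand_shared_weights(shared, group_of_subband):
    """Map tied weights of shape (G, M) to per-subband weights of shape (K, M).

    group_of_subband[k] is the index of the group whose weights subband k uses
    (an assumed tying scheme, for illustration only).
    """
    return shared[group_of_subband]


def beamform_shared(X, shared, group_of_subband):
    """Filter-and-sum in the subband domain with tied weights.

    X      : (M, K, T) complex subband signals (M mics, K subbands, T frames)
    shared : (G, M) complex weights, one row per group of subbands
    Returns Y of shape (K, T), where Y[k, t] = sum_m conj(W[k, m]) * X[m, k, t].
    """
    W = expand_shared_weights(shared, group_of_subband)  # (K, M)
    return np.einsum("km,mkt->kt", np.conj(W), X)


if __name__ == "__main__":
    M, K, T, G = 4, 8, 5, 4
    rng = np.random.default_rng(0)
    X = rng.standard_normal((M, K, T)) + 1j * rng.standard_normal((M, K, T))
    group_of_subband = np.repeat(np.arange(G), K // G)  # two subbands per group
    shared = np.full((G, M), 1.0 / M, dtype=complex)    # uniform weights
    Y = beamform_shared(X, shared, group_of_subband)
    # Free parameters with sharing (G*M) versus without sharing (K*M).
    print(Y.shape, shared.size, K * M)
```

In this sketch an optimizer would update only the G x M entries of shared rather than all K x M per-subband weights, which is the kind of reduction in jointly optimized parameters the abstract describes.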
Keywords :
array signal processing; convergence of numerical methods; feature extraction; hidden Markov models; maximum likelihood estimation; microphones; smoothing methods; spectral analysis; speech recognition; HMM; S-LIMABEAM; computational efficiency; convergence; microphone array processing; parameter sharing; recognition accuracy; reverberant environments; spectral smoothing; speech recognizer; subband filter-and-sum beamformer; subband likelihood-maximizing beamforming; filters; microphone arrays; optimization methods; working environment noise
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
Print_ISBN :
0-7803-8484-9
DOI :
10.1109/ICASSP.2004.1326127