Title :
On combining frequency warping and spectral shaping in HMM based speech recognition
Author :
Potamianos, Alexandros ; Rose, Richard C.
Author_Institution :
AT&T Bell Labs., Murray Hill, NJ, USA
Abstract :
Frequency warping approaches to speaker normalization have been proposed and evaluated on various speech recognition tasks. These techniques have been found to significantly improve performance even for speaker-independent recognition from short utterances over the telephone network. In maximum likelihood (ML) based model adaptation, a linear transformation is estimated and applied to the model parameters to increase the likelihood of the input utterance. The purpose of this paper is to demonstrate that a significant advantage can be gained by performing frequency warping and ML speaker adaptation in a unified framework. A procedure is described that compensates utterances by simultaneously scaling the frequency axis and reshaping the spectral energy contour. This procedure is shown to reduce the error rate in a telephone-based connected digit recognition task by 30-40%.
Keywords :
hidden Markov models; maximum likelihood estimation; spectral analysis; speech recognition; telephony; HMM; connected digit recognition; error rate reduction; frequency warping; input utterance; linear transformation; maximum likelihood speaker adaptation; parameter estimation; short utterances; speaker independent recognition; speaker normalization; spectral energy contour reshaping; spectral shaping; telephone network; Acoustic distortion; Acoustic transducers; Adaptation model; Automatic speech recognition; Error analysis; Frequency estimation; Hidden Markov models; Maximum likelihood estimation; Speech recognition; Telephony
Conference_Title :
Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
Conference_Location :
Munich
Print_ISBN :
0-8186-7919-0
DOI :
10.1109/ICASSP.1997.596178