DocumentCode :
284610
Title :
Fast speaker adaptation combined with soft vector quantization in an HMM speech recognition system
Author :
Class, F. ; Kaltenmeir, A. ; Regal-Brietzmann, P. ; Trottler, K.
Author_Institution :
Daimler Benz AG, Ulm, Germany
Volume :
1
fYear :
1992
fDate :
23-26 Mar 1992
Firstpage :
461
Abstract :
The authors describe a method for combining speaker adaptation by feature vector transformation with semi-continuous hidden Markov modeling (SCHMM). Since the reference speaker's voice is represented in the SCHMM system by multidimensional Gaussian distributions, it is these distributions rather than feature vectors that must be transformed. The performance of hard-decision vector quantization (HVQ), soft-decision VQ (SVQ), and SCHMM is compared, as is that of the speaker-adaptive and speaker-independent systems. In addition, the influence of dynamic features is investigated. The definition of subword units is optimized, and the SCHMM system is optimized with respect to full or diagonal covariance matrices and codebook size. Model initialization and distribution reestimation during training are introduced. Significant improvements in mean recognition rate under difficult conditions are obtained compared to previously reported systems based on HVQ: from 71.6% to 84.6% (speaker-independent) and from 80.4% to 87.4% (speaker-adaptive).
Keywords :
hidden Markov models; speech recognition; vector quantisation; codebook size; diagonal covariance matrices; distribution reestimation; dynamic features; feature vector transformation; full covariance matrices; hard-decision VQ; mean recognition rate; model initialisation; multidimensional Gaussian distributions; reference speaker; semi-continuous HMM; semi-continuous hidden Markov modeling; soft-decision VQ; speaker adaptation; speaker adaptive systems; speaker-independent systems; subword units; training; vector quantization; Covariance matrix; Gaussian distribution; Hidden Markov models; Legged locomotion; Multidimensional systems; Speech recognition; System testing; Vector quantization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 1992. ICASSP-92., 1992 IEEE International Conference on
Conference_Location :
San Francisco, CA
ISSN :
1520-6149
Print_ISBN :
0-7803-0532-9
Type :
conf
DOI :
10.1109/ICASSP.1992.225872
Filename :
225872