DocumentCode :
337476
Title :
Improved methods for vocal tract normalization
Author :
Welling, L. ; Kanthak, S. ; Ney, H.
Author_Institution :
Tech. Hochschule Aachen, Germany
Volume :
2
fYear :
1999
fDate :
15-19 Mar 1999
Firstpage :
761
Abstract :
This paper presents improved methods for vocal tract normalization (VTN) along with experimental tests on three databases. We propose a new method for VTN in training: by using acoustic models with single Gaussian densities per state for selecting the normalization scales, the need for the models to learn the normalization scales of the training speakers is avoided. We show that using single Gaussian densities for selecting the normalization scales in training results in lower error rates than using mixture densities. For VTN in recognition, we propose an improvement of the well-known multiple-pass strategy: by using an unnormalized acoustic model for the first recognition pass instead of a normalized model, lower error rates are obtained. In recognition tests, this method is compared with a fast variant of VTN. The multiple-pass strategy is an efficient method, but it is suboptimal because the normalization scale and the word sequence are determined sequentially. We found that for telephone digit string recognition this suboptimality reduces the VTN gain in recognition performance by 30% relative. On the German spontaneous scheduling task Verbmobil, the WSJ task, and the German telephone digit string corpus SieTill, the proposed methods for VTN reduce the error rates significantly.
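The idea of selecting a normalization scale by scoring warped data against a single Gaussian density can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the knee position, the grid of candidate scales, and the use of scalar frequencies in place of acoustic feature vectors are all assumptions made for illustration.

```python
import math

def warp_frequency(f, alpha, f_max=4000.0, knee=0.875):
    """Piecewise-linear warp (assumed form): scale by alpha below the knee,
    then follow a straight line that maps f_max onto f_max.
    Assumes alpha * knee < 1 so the second segment stays increasing."""
    f0 = knee * f_max
    if f <= f0:
        return alpha * f
    # Second segment joins (f0, alpha * f0) to (f_max, f_max).
    slope = (f_max - alpha * f0) / (f_max - f0)
    return alpha * f0 + slope * (f - f0)

def gaussian_log_likelihood(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def select_warp_factor(observations, mean, var, alphas):
    """Grid search: return the candidate scale whose warped observations
    score highest under a single Gaussian (hypothetical toy model)."""
    def score(alpha):
        return sum(
            gaussian_log_likelihood(warp_frequency(f, alpha), mean, var)
            for f in observations
        )
    return max(alphas, key=score)

# Hypothetical candidate grid: 0.88, 0.90, ..., 1.12.
alphas = [round(0.88 + 0.02 * i, 2) for i in range(13)]
best = select_warp_factor([540.0, 550.0, 560.0], mean=500.0, var=100.0, alphas=alphas)
```

In a multiple-pass setup of the kind the abstract describes, a search like this would sit between a first recognition pass, which supplies a preliminary word sequence, and a second pass on the warped features. The sketch omits the alignment to that word sequence and scores every observation against one density.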
Keywords :
Gaussian processes; speech processing; speech recognition; German spontaneous scheduling task; German telephone digit string corpus; SieTill; Verbmobil; WSJ task; databases; error rates; experimental tests; improved methods; mixture densities; multiple-pass strategy; normalization scales; recognition performance; recognition tests; single Gaussian densities; suboptimal method; telephone digit string recognition; training speakers; unnormalized acoustic model; vocal tract normalization; word sequence; Acoustic testing; Databases; Error analysis; Frequency; Loudspeakers; Performance gain; Piecewise linear techniques; Speech; Telephony
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on
Conference_Location :
Phoenix, AZ
ISSN :
1520-6149
Print_ISBN :
0-7803-5041-3
Type :
conf
DOI :
10.1109/ICASSP.1999.759780
Filename :
759780