DocumentCode :
312148
Title :
A unified spectral transformation adaptation approach for robust speech recognition
Author :
Yao, Lei ; Yu, Dong ; Huang, Taiyi
Author_Institution :
Nat. Lab. of Pattern Recognition, Acad. Sinica, Beijing, China
Volume :
2
fYear :
1996
fDate :
3-6 Oct 1996
Firstpage :
981
Abstract :
Canonical correlation-based compensation (CCBC) is proposed as a unified approach to cope with the mismatch between training and test conditions. This mismatch can be grouped into three classes: differences between speakers, changes in the recording channel, and the effects of a noisy environment. In previous work (1995), we successfully used the CCBC approach, with some modifications, to make our speech recognizer robust to a noisy environment. Recently, the same approach has been extended to speaker and channel adaptation. The results of our experiments show that the CCBC approach compensates well for all three sources of distortion between training and test conditions. To compare the performance of CCBC with that of some conventional adaptation approaches, we also test cepstral mean normalization, RASTA and Lin-Log RASTA. We find that CCBC performs better than all of them. The selection of appropriate reference speech data, a very important problem in the CCBC approach, is also discussed in this paper.
Keywords :
audio recording; cepstral analysis; compensation; correlation methods; software performance evaluation; speech recognition; Lin-Log RASTA; canonical correlation-based compensation; cepstral mean normalization; channel adaptation; clustering; distortion source; noisy environment; performance; recording channel changes; reference speech data selection; robust speech recognition; speaker adaptation; speaker differences; training set/test set mismatch; unified spectral transformation adaptation approach; Acoustic distortion; Acoustic noise; Cepstral analysis; Loudspeakers; Robustness; Speech enhancement; Speech recognition; Testing; Vectors; Working environment noise
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
Type :
conf
DOI :
10.1109/ICSLP.1996.607767
Filename :
607767