DocumentCode :
2553170
Title :
Leveraging gain normalization for sub-band temporal features in noise-robust speech recognition
Author :
Fan, Hao-Teng ; Hung, Jeih-weih
Author_Institution :
Dept. of Electr. Eng., Nat. Chi Nan Univ., Nantou, Taiwan
fYear :
2012
fDate :
29-31 May 2012
Firstpage :
1409
Lastpage :
1412
Abstract :
This paper proposes performing sub-band division with the discrete wavelet transform (DWT) before gain normalization (GN) when producing speech features. The DWT first decomposes the temporal-domain cepstral feature sequence into sub-band feature streams, and GN is then applied to each sub-band stream separately. The new feature stream is obtained by applying the inverse DWT to the normalized sub-band streams. Compared with GN applied directly to the original full-band stream, this approach handles the distortion in each sub-band individually and is therefore expected to be more noise-robust. On the Aurora-2 database and task, the sub-band GN achieves relative word error reductions of 65.51% over the baseline process and 18.20% over the original full-band GN.
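The three steps named in the abstract (decompose, normalize each sub-band, reconstruct) can be illustrated with a minimal sketch. Everything specific here is an assumption, not the authors' implementation: the record does not state which wavelet or how many decomposition levels were used, so the sketch uses a single-level Haar transform; it also does not define GN, so the sketch assumes mean removal followed by division by the dynamic range (maximum minus minimum). The function names `haar_dwt`, `haar_idwt`, `gain_normalize`, and `subband_gn` are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(x):
    """One-level Haar DWT of an even-length sequence (assumed wavelet)."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / SQRT2)
        x.append((a - d) / SQRT2)
    return x


def gain_normalize(stream, eps=1e-12):
    """Assumed GN: subtract the mean, then divide by (max - min)."""
    mean = sum(stream) / len(stream)
    centered = [v - mean for v in stream]
    span = max(centered) - min(centered)
    if span < eps:  # constant stream: nothing to rescale
        return centered
    return [v / span for v in centered]


def subband_gn(x):
    """Apply GN to each Haar sub-band of one cepstral coefficient's
    temporal sequence, then reconstruct with the inverse transform."""
    n = len(x)
    padded = list(x)
    if n % 2:  # repeat the last frame so the length is even
        padded.append(padded[-1])
    approx, detail = haar_dwt(padded)
    y = haar_idwt(gain_normalize(approx), gain_normalize(detail))
    return y[:n]  # drop the padding frame, if any
```

In use, `subband_gn` would be called once per cepstral dimension, on that coefficient's sequence of values over the frames of an utterance. Full-band GN under the same assumption is simply `gain_normalize` applied to the undecomposed sequence.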
Keywords :
discrete wavelet transforms; speech recognition; Aurora-2 database; discrete wavelet transform; full-band stream; gain normalization; noise-robust speech recognition; subband GN; subband distortions; subband feature stream; subband temporal features; temporal-domain cepstral feature sequence; word error reduction; Accuracy; Frequency modulation; Speech; Training; robust speech feature
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems and Knowledge Discovery (FSKD), 2012 9th International Conference on
Conference_Location :
Sichuan
Print_ISBN :
978-1-4673-0025-4
Type :
conf
DOI :
10.1109/FSKD.2012.6234339
Filename :
6234339