DocumentCode :
2831697
Title :
The study of q-logarithmic modulation spectral normalization for robust speech recognition
Author :
Fan, Hao-Teng ; Hsu, Che-hsien ; Hung, Jeih-weih
Author_Institution :
Dept. of Electr. Eng., Nat. Chi Nan Univ., Nantou, Taiwan
fYear :
2012
fDate :
June 30 2012-July 2 2012
Firstpage :
183
Lastpage :
186
Abstract :
This paper presents a novel use of the generalized logarithm operation (q-logarithm) to refine the modulation spectrum of speech features for noise-robust speech recognition. The resulting method, generalized logarithmic modulation spectral mean normalization (GLMSMN), equalizes the average of the magnitude modulation spectrum in the q-logarithmic domain across utterances in order to alleviate the effect of noise. On the Aurora-2 connected-digit database and evaluation task, GLMSMN applied to MVN features yields a significant improvement in recognition accuracy over both the MFCC baseline and MVN alone. The overall average recognition accuracy achieved with GLMSMN approaches 90%.
Keywords :
speech recognition; visual databases; GLMSMN; digit database; generalized logarithm operation; generalized logarithmic modulation spectral mean normalization; magnitude modulation spectrum; modulation spectrum; noise robust speech recognition; q-logarithmic domain; q-logarithmic modulation spectral normalization; robust speech recognition; speech features; Accuracy; Mel frequency cepstral coefficient; Modulation; Noise; Robustness; Speech; Speech recognition; q-logarithm
fLanguage :
English
Publisher :
IEEE
Conference_Title :
System Science and Engineering (ICSSE), 2012 International Conference on
Conference_Location :
Dalian, Liaoning
Print_ISBN :
978-1-4673-0944-8
Electronic_ISBN :
978-1-4673-0943-1
Type :
conf
DOI :
10.1109/ICSSE.2012.6257173
Filename :
6257173