DocumentCode :
2791618
Title :
Using duration and pitch for Mandarin digit string recognition
Author :
Zhao, Rui ; Kida, Yusuke ; Yan, Xiang ; Ding, Pei ; He, Lei
Author_Institution :
Toshiba (China) R&D Center, Beijing, China
fYear :
2010
fDate :
14-19 March 2010
Firstpage :
4846
Lastpage :
4849
Abstract :
Mandarin digit string recognition (MDSR) is challenging because acoustic discrimination is difficult for such a small-vocabulary speech recognition task. In this paper, we propose to improve MDSR performance by using duration and pitch information. Speech rate dispersion is used to capture duration knowledge and is incorporated into the MDSR system by rescoring the N-best candidates in a two-pass framework. Pitch, estimated with a robust pitch extraction method, is also adopted to improve the acoustic discrimination among Mandarin digits. The experimental results show that both duration and pitch significantly improve performance, and that combining them gives further improvement. Moreover, our methods are robust to background noise. In the evaluation, the sentence error rate is reduced by 50.43% on average over different SNR conditions.
Keywords :
acoustic signal processing; speech recognition; vocabulary; MDSR performance; acoustic discrimination; duration; mandarin digit string recognition; pitch; two pass framework; vocabulary speech recognition; Background noise; Data mining; Error analysis; Noise robustness; Signal to noise ratio; Speech recognition; Vocabulary; Mandarin digit string recognition; duration; pitch;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
ISSN :
1520-6149
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2010.5495128
Filename :
5495128