DocumentCode :
469051
Title :
Research on confusion network algorithm for Mandarin large vocabulary continuous speech recognition
Author :
Wu, Bin ; Liu, Gang ; Guo, Jun
Author_Institution :
Beijing Univ. of Posts & Telecommun., Beijing
Volume :
3
fYear :
2007
fDate :
2-4 Nov. 2007
Firstpage :
1080
Lastpage :
1084
Abstract :
Decoding based on the maximum a posteriori (MAP) decision rule is commonly used in Mandarin large vocabulary continuous speech recognition, and it yields recognition results with the minimum sentence error rate. However, word error rate (WER) is the more commonly used performance measure, so decoding based on the minimum Bayes risk (MBR) decision rule has been proposed to optimize WER directly. One method of MBR decoding transforms the word lattice into a confusion network in order to obtain the hypothesis with minimum WER. Building on previous work and exploiting the characteristics of Mandarin, we propose a Chinese character confusion network generation algorithm. First, Chinese word lattices are produced by a standard Mandarin large vocabulary continuous speech recognizer; next, each Chinese word lattice is analyzed and processed according to features of the Chinese language to produce a Chinese character lattice; finally, a Chinese character confusion network is produced by performing multiple alignment on the Chinese character lattice. Experimental results on the 2005 HTRDP (863) evaluation corpus show that the proposed algorithm yields a lower WER than both MAP recognition and word confusion network decoding.
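The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the `Edge` type, the function names, and the toy posteriors are all hypothetical, and the multiple-alignment step that builds the confusion network from the character lattice is omitted. The sketch only shows (1) splitting each word edge of a lattice into single-character edges, under the simplifying assumption that every character inherits the posterior of its word, and (2) reading a consensus hypothesis out of an already built character confusion network by choosing the highest-posterior entry in each slot.

```python
from collections import namedtuple

# Hypothetical lattice edge: start node, end node, label, posterior probability.
Edge = namedtuple("Edge", ["start", "end", "label", "posterior"])


def word_lattice_to_char_lattice(word_edges):
    """Split every word edge into a chain of single-character edges.

    Assumption for illustration: each character edge simply inherits the
    posterior of the word edge it came from.
    """
    char_edges = []
    for idx, edge in enumerate(word_edges):
        chars = list(edge.label)
        prev = edge.start
        for pos, ch in enumerate(chars):
            if pos == len(chars) - 1:
                nxt = edge.end  # the last character reaches the original end node
            else:
                nxt = ("mid", idx, pos)  # fresh intermediate node inside the word
            char_edges.append(Edge(prev, nxt, ch, edge.posterior))
            prev = nxt
    return char_edges


def consensus_hypothesis(confusion_network):
    """Pick the highest-posterior entry in each slot of a confusion network.

    Each slot is a dict mapping a character to its posterior; the empty
    string stands for "no character here" and is skipped in the output.
    """
    best_chars = []
    for slot in confusion_network:
        best = max(slot, key=slot.get)
        if best:
            best_chars.append(best)
    return "".join(best_chars)


# Toy word lattice: one two-character word competing with two one-character words.
word_edges = [
    Edge(0, 2, "北京", 0.6),
    Edge(0, 1, "背", 0.4),
    Edge(1, 2, "景", 0.4),
]
char_edges = word_lattice_to_char_lattice(word_edges)

# Toy character confusion network, written by hand rather than derived by alignment.
toy_network = [
    {"北": 0.6, "背": 0.4},
    {"京": 0.6, "景": 0.4},
    {"": 0.7, "的": 0.3},
]
hypothesis = consensus_hypothesis(toy_network)
```

Working at the character level means that competing word segmentations of the same audio span contribute to shared slots, so their posteriors can be compared slot by slot rather than word by word.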
Keywords :
Bayes methods; character recognition; decoding; maximum likelihood estimation; natural language processing; speech recognition; Bayes risk decision rule; Chinese character confusion network generation algorithm; Chinese character lattice; Mandarin large vocabulary; continuous speech recognition; decoding method; maximum a posterior probability decision rule; sentence error rate; word error rate; Character generation; Character recognition; Decoding; Error analysis; Lattices; Natural languages; Optimization methods; Speech analysis; Speech recognition; Vocabulary; Chinese character confusion network; minimum Bayes decision rule; word error rate
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Wavelet Analysis and Pattern Recognition, 2007. ICWAPR '07. International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-1065-1
Electronic_ISBN :
978-1-4244-1066-8
Type :
conf
DOI :
10.1109/ICWAPR.2007.4421593
Filename :
4421593