DocumentCode
2700500
Title
Discriminative Training of Decoding Graphs for Large Vocabulary Continuous Speech Recognition
Author
Kuo, H.J.; Kingsbury, Brian; Zweig, Geoffrey
Author_Institution
IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA
Volume
4
fYear
2007
fDate
15-20 April 2007
Abstract
Finite-state decoding graphs integrate the decision trees, pronunciation model and language model for speech recognition into a unified representation of the search space. We explore discriminative training of the transition weights in the decoding graph in the context of large vocabulary speech recognition. In preliminary experiments on the RT-03 English Broadcast News evaluation set, the word error rate was reduced by about 5.7% relative, from 23.0% to 21.7%. We discuss how this method is particularly applicable to low-latency and low-resource applications such as real-time closed captioning of broadcast news and interactive speech-to-speech translation.
Keywords
decision trees; decoding; natural language processing; speech coding; speech recognition; RT-03 English Broadcast News evaluation set; decoding graphs; discriminative training; finite-state decoding graphs; interactive speech-to-speech translation; language model; large vocabulary continuous speech recognition; pronunciation model; word error rate; Broadcasting; Context modeling; Decision trees; Error analysis; Hidden Markov models; Maximum likelihood decoding; Maximum likelihood estimation; Natural languages; Speech recognition; Vocabulary; Discriminative training; Finite-state decoding graph; Language model; Low-resource speech recognition; Pronunciation model
fLanguage
English
Publisher
IEEE
Conference_Titel
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007)
Conference_Location
Honolulu, HI
ISSN
1520-6149
Print_ISBN
1-4244-0727-3
Type
conf
DOI
10.1109/ICASSP.2007.367159
Filename
4218033