DocumentCode :
2790161
Title :
Jointly recognizing multi-speaker conversations
Author :
Ji, Gang ; Bilmes, Jeff
Author_Institution :
Dept. of Electr. Eng., Univ. of Washington, Seattle, WA, USA
fYear :
2010
fDate :
14-19 March 2010
Firstpage :
5110
Lastpage :
5113
Abstract :
We suggest an approach to speech recognition where multiple sides of a conversation in a dialog or meeting are processed and decoded jointly rather than independently. We moreover introduce a practical implementation of this approach that demonstrates both language model perplexity and speech recognition word error rate improvements in conversational telephone speech. Specifically, we show that such benefits can be had if an n-gram language model, in addition to conditioning on immediately preceding words in an utterance, is also allowed to condition on the estimated dialog act of the immediately preceding utterance of an alternate speaker.
Keywords :
computational linguistics; speech recognition; word processing; language model perplexity; multispeaker conversation; n-gram language model; word error rate; Decoding; Error analysis; Graphical models; Humans; Natural languages; Speech analysis; Telephony; Timing; Vocabulary; multi-speaker
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
ISSN :
1520-6149
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2010.5495041
Filename :
5495041