Title :
Jointly recognizing multi-speaker conversations
Author :
Ji, Gang ; Bilmes, Jeff
Author_Institution :
Dept. of Electr. Eng., Univ. of Washington, Seattle, WA, USA
Abstract :
We propose an approach to speech recognition in which multiple sides of a conversation in a dialog or meeting are processed and decoded jointly rather than independently. We also introduce a practical implementation of this approach that improves both language model perplexity and speech recognition word error rate on conversational telephone speech. Specifically, we show that these benefits can be obtained if an n-gram language model, in addition to conditioning on the immediately preceding words in an utterance, is also allowed to condition on the estimated dialog act of the immediately preceding utterance of an alternate speaker.
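The conditioning described in the abstract can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation: it assumes a count-based bigram model with add-one smoothing, takes the other speaker's dialog act as a given label rather than estimating it, and uses hypothetical names (`DialogActBigramLM`, the tags `question` and `statement`) and made-up training utterances.

```python
from collections import defaultdict
import math


class DialogActBigramLM:
    """Toy add-one-smoothed bigram model conditioned on a dialog-act tag.

    Hypothetical illustration only; the tag is assumed to describe the
    immediately preceding utterance of the other speaker.
    """

    def __init__(self):
        # counts[(dialog_act, previous_word)][word] -> count
        self.counts = defaultdict(lambda: defaultdict(int))
        self.vocab = set()

    def train(self, examples):
        # examples: iterable of (dialog_act_of_other_speaker, list_of_words)
        for act, words in examples:
            prev = "<s>"
            for w in words + ["</s>"]:
                self.counts[(act, prev)][w] += 1
                self.vocab.add(w)
                prev = w

    def prob(self, word, prev, act):
        # P(word | prev, act) with add-one smoothing over the vocabulary
        ctx = self.counts.get((act, prev), {})
        total = sum(ctx.values())
        return (ctx.get(word, 0) + 1) / (total + len(self.vocab))

    def log_prob(self, words, act):
        # Log-probability of a whole utterance given the other speaker's act
        prev, lp = "<s>", 0.0
        for w in words + ["</s>"]:
            lp += math.log(self.prob(w, prev, act))
            prev = w
        return lp


# Made-up training data: responses paired with the other speaker's dialog act.
examples = [
    ("question", ["yes", "i", "do"]),
    ("question", ["no", "i", "don't"]),
    ("statement", ["uh-huh"]),
    ("statement", ["right"]),
]
lm = DialogActBigramLM()
lm.train(examples)

p_after_question = lm.prob("yes", "<s>", "question")
p_after_statement = lm.prob("yes", "<s>", "statement")
```

In this toy data, "yes" is more probable as the first word when the other speaker has just asked a question than when they have made a statement, which is the kind of cross-speaker dependency the abstract refers to.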
Keywords :
computational linguistics; speech recognition; word processing; language model perplexity; multispeaker conversation; n-gram language model; word error rate; Decoding; Error analysis; Graphical models; Humans; Natural languages; Speech analysis; Telephony; Timing; Vocabulary; multi-speaker
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4244-4295-9
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2010.5495041