DocumentCode
3244692
Title
A hybrid barge-in procedure for more reliable turn-taking in human-machine dialog systems
Author
Rose, Richard C. ; Kim, Hong Kook
Author_Institution
AT&T Labs.-Res., USA
fYear
2003
fDate
30 Nov.-3 Dec. 2003
Firstpage
198
Lastpage
203
Abstract
This paper investigates techniques designed to allow users of human-machine dialog systems to interrupt, or barge in over, machine-generated speech messages. An experimental study was performed on utterances collected from a telephone-based dialog system to analyze the effect of barge-in performance on users' speech. One result of this study was that excessive barge-in latencies resulted in disfluencies appearing in over half of users' utterances. A hybrid procedure for barge-in detection is proposed and evaluated on utterances collected from the same domain. The procedure combines a feature-based voice activity detection (VAD) algorithm with a model-based approach for verifying hypothesized speech segments. The procedure is shown to obtain better detection performance than procedures that rely on the speech recognition decoder to detect speech. It is also found to have latencies comparable to those obtained by low-delay feature-based speech detection algorithms.
Keywords
feature extraction; speech processing; speech-based user interfaces; barge-in latency; feature-based voice activity detection; human-machine dialog systems; hybrid barge-in procedure; hypothesized speech segment verification; low delay speech detection; machine generated speech message interruption; speech analysis; turn-taking; user utterance disfluencies; Acoustic signal detection; Automatic speech recognition; Computer vision; Decoding; Delay; Event detection; Man machine systems; Protocols; Speech analysis; Telephony;
fLanguage
English
Publisher
IEEE
Conference_Titel
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN
0-7803-7980-2
Type
conf
DOI
10.1109/ASRU.2003.1318428
Filename
1318428