Title :
An instantaneous vector representation of delta pitch for speaker-change prediction in conversational dialogue systems
Author :
Laskowski, Kornel ; Edlund, Jens ; Heldner, Mattias
Author_Institution :
interACT, Carnegie Mellon Univ., Pittsburgh, PA
Date :
March 31, 2008 to April 4, 2008
Abstract :
As spoken dialogue systems become deployed in increasingly complex domains, they face rising demands on the naturalness of interaction. We focus on system responsiveness, aiming to mimic human-like dialogue flow control by predicting speaker changes as observed in real human-human conversations. We derive an instantaneous vector representation of pitch variation and show that it is amenable to standard acoustic modeling techniques. Using a small amount of automatically labeled data, we train models which significantly outperform current state-of-the-art pause-only systems, and replicate to within 1% absolute the performance of our previously published hand-crafted baseline. The new system additionally offers scope for run-time control over the precision or recall of locations at which to speak.
Keywords :
acoustic signal processing; signal representation; speaker recognition; acoustic modeling; complex domains; delta pitch variation; human-human conversations; instantaneous vector representation; run-time control; speaker change prediction; spoken dialogue systems; Automatic control; Automatic speech recognition; Communication system control; Control systems; Delay; Humans; Loudspeakers; Oral communication; Predictive models; Runtime; Frequency domain analysis; Signal representation; Speech communication; Speech processing; User interfaces
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
978-1-4244-1483-3
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2008.4518791