  • DocumentCode
    3427224
  • Title
    An instantaneous vector representation of delta pitch for speaker-change prediction in conversational dialogue systems
  • Author
    Laskowski, Kornel ; Edlund, Jens ; Heldner, Mattias
  • Author_Institution
    interACT, Carnegie Mellon Univ., Pittsburgh, PA
  • fYear
    2008
  • fDate
    March 31 2008-April 4 2008
  • Firstpage
    5041
  • Lastpage
    5044
  • Abstract
    As spoken dialogue systems become deployed in increasingly complex domains, they face rising demands on the naturalness of interaction. We focus on system responsiveness, aiming to mimic human-like dialogue flow control by predicting speaker changes as observed in real human-human conversations. We derive an instantaneous vector representation of pitch variation and show that it is amenable to standard acoustic modeling techniques. Using a small amount of automatically labeled data, we train models which significantly outperform current state-of-the-art pause-only systems, and replicate to within 1% absolute the performance of our previously published hand-crafted baseline. The new system additionally offers scope for run-time control over the precision or recall of locations at which to speak.
  • Keywords
    acoustic signal processing; signal representation; speaker recognition; acoustic modeling; complex domains; delta pitch variation; human-human conversations; instantaneous vector representation; run-time control; speaker change prediction; spoken dialogue systems; Automatic control; Automatic speech recognition; Communication system control; Control systems; Delay; Humans; Loudspeakers; Oral communication; Predictive models; Runtime; Frequency domain analysis; Signal representation; Speech communication; Speech processing; User interfaces;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
  • Conference_Location
    Las Vegas, NV
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-1483-3
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2008.4518791
  • Filename
    4518791