Title :
Exploiting loudness dynamics in stochastic models of turn-taking
Author_Institution :
Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
Stochastic turn-taking models have traditionally been implemented as N-grams, which condition predictions on recent binary-valued speech/non-speech contours. The current work re-implements this function using feed-forward neural networks, capable of accepting binary- as well as continuous-valued features; performance is shown to asymptotically approach that of the N-gram baseline as model complexity increases. The conditioning context is then extended to leverage loudness contours. Experiments indicate that the additional sensitivity to loudness considerably decreases average cross-entropy rates on unseen data, by 0.03 bits per 100-ms framing interval. This reduction is shown to make loudness-sensitive conversants capable of better predictions, with attention memory requirements at least 5 times smaller and responsiveness latency at least 10 times shorter than those of the loudness-insensitive baseline.
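The modeling setup described in the abstract can be illustrated with a minimal sketch: predict each 100-ms speech/non-speech frame from a window of preceding frames, optionally augmented with a loudness contour, and score the model by cross entropy in bits per frame. This is an illustrative toy on synthetic data, not the paper's actual model or corpus; the context length `K`, the synthetic dialogue process, and the use of a hidden-layer-free (logistic-regression) predictor in place of the paper's feed-forward networks are all assumptions made here for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic single-speaker activity at 100-ms frames: a "sticky" binary
# process (speech tends to persist), plus a loudness contour that is
# loosely correlated with speech activity. Purely illustrative data.
T = 5000
speech = np.zeros(T, dtype=int)
for t in range(1, T):
    speech[t] = speech[t - 1] if rng.random() < 0.9 else 1 - speech[t - 1]
loudness = speech * (0.5 + 0.5 * rng.random(T)) + 0.05 * rng.random(T)

K = 5  # conditioning context in frames (an assumption, not the paper's value)

def make_xy(use_loudness):
    """Build (context, next-frame) pairs, with or without loudness features."""
    X, y = [], []
    for t in range(K, T):
        feats = list(speech[t - K:t])          # binary speech/non-speech context
        if use_loudness:
            feats += list(loudness[t - K:t])   # continuous loudness context
        X.append(feats)
        y.append(speech[t])
    return np.array(X, dtype=float), np.array(y, dtype=float)

def train_logreg(X, y, epochs=200, lr=0.1):
    """Degenerate feed-forward net (no hidden layer) trained by gradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid output
        g = p - y                               # gradient of log loss
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

def cross_entropy_bits(X, y, w, b):
    """Average cross entropy of the model on (X, y), in bits per frame."""
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    eps = 1e-9
    return -np.mean(y * np.log2(p + eps) + (1 - y) * np.log2(1 - p + eps))

split = (T - K) // 2  # first half trains, second half is "unseen data"
for use_loud in (False, True):
    X, y = make_xy(use_loud)
    w, b = train_logreg(X[:split], y[:split])
    ce = cross_entropy_bits(X[split:], y[split:], w, b)
    print(f"loudness={use_loud}: {ce:.3f} bits/frame")
```

On this toy data the loudness features are largely redundant with the binary contour, so the gap between the two conditions will not mirror the 0.03-bit reduction reported above; the sketch only shows the evaluation protocol (per-frame prediction, held-out cross entropy in bits per framing interval).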
Keywords :
computational complexity; entropy; feedforward neural nets; speech recognition; speech synthesis; stochastic processes; N-gram baseline; attention memory requirements; binary-valued features; binary-valued speech-nonspeech contours; continuous-valued features; cross-entropy; feedforward neural networks; loudness contours; loudness dynamics; loudness-sensitive conversants; model complexity; spoken dialogue systems; stochastic turn-taking models; Artificial neural networks; Computational modeling; Context; Entropy; Speech; Standards; Stochastic processes; Interaction models; neural networks; prosody; spoken dialogue systems;
Conference_Title :
Spoken Language Technology Workshop (SLT), 2012 IEEE
Conference_Location :
Miami, FL
Print_ISBN :
978-1-4673-5125-6
Electronic_ISBN :
978-1-4673-5124-9
DOI :
10.1109/SLT.2012.6424201