DocumentCode :
3646039
Title :
On-line policy optimisation of spoken dialogue systems via live interaction with human subjects
Author :
Milica Gašić;Filip Jurčíček;Blaise Thomson;Kai Yu;Steve Young
Author_Institution :
Cambridge University Engineering Department, Trumpington St, Cambridge CB2 1PZ, UK
fYear :
2011
Firstpage :
312
Lastpage :
317
Abstract :
Statistical dialogue models typically require a large number of dialogues to optimise the dialogue policy and therefore rely on a simulated user. This results in a mismatch between training and live conditions, and the significant cost of developing the simulator mitigates many of the claimed benefits of such models. Recent work on Gaussian process reinforcement learning has shown that learning can be substantially accelerated. This paper reports on an experiment in which a policy for a real-world task is learnt directly from human interaction, using rewards provided by the users themselves. It shows that a usable policy can be learnt in just a few hundred dialogues without a user simulator, using a learning strategy that reduces the risk of taking bad actions. The paper also investigates adaptation behaviour when the system continues learning over several thousand dialogues, and highlights the need for robustness to noisy rewards.
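The abstract's key idea is that Gaussian process regression gives not only an estimate of expected reward but also an uncertainty, which a learning strategy can use to avoid risky actions. The following is a minimal illustrative sketch of GP posterior inference, not the paper's actual GP-SARSA algorithm; the squared-exponential kernel, hyperparameters, and toy data are assumptions for demonstration only.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between the row vectors of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gp_posterior(X_train, y_train, X_test, noise=0.1):
    """Posterior mean and pointwise variance of a zero-mean GP at X_test,
    conditioned on noisy observations (X_train, y_train)."""
    K = rbf_kernel(X_train, X_train) + noise ** 2 * np.eye(len(X_train))
    K_s = rbf_kernel(X_test, X_train)
    K_ss = rbf_kernel(X_test, X_test)
    alpha = np.linalg.solve(K, y_train)          # K^{-1} y
    mean = K_s @ alpha
    cov = K_ss - K_s @ np.linalg.solve(K, K_s.T)
    return mean, np.diag(cov)

# Toy example: rewards observed at three points in a 1-D feature space.
X_train = np.array([[0.0], [1.0], [2.0]])
y_train = np.array([0.0, 1.0, 0.0])
X_test = np.array([[1.0], [5.0]])               # near data vs. far from data
mean, var = gp_posterior(X_train, y_train, X_test)
```

A risk-averse strategy of the kind hinted at in the abstract would prefer actions where the posterior variance is low: here `var[0]` (near observed data) is much smaller than `var[1]` (far from it), so the second point would be treated as risky.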
Keywords :
"Training","Error analysis","Learning systems","Gaussian processes","Kernel","Humans","Robustness"
Publisher :
ieee
Conference_Title :
Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on
Print_ISBN :
978-1-4673-0365-1
Type :
conf
DOI :
10.1109/ASRU.2011.6163950
Filename :
6163950