Title :
Monte-Carlo utility estimates for Bayesian reinforcement learning
Author :
Dimitrakakis, Christos
Author_Institution :
Chalmers University of Technology, Gothenburg, Sweden
Abstract :
This paper presents algorithms for Monte-Carlo Bayesian reinforcement learning. First, Monte-Carlo estimates of upper bounds on the Bayes-optimal value function are used to construct an optimistic policy. Second, gradient-based algorithms for computing approximate bounds are introduced. Finally, a new class of gradient algorithms for Bayesian Bellman error minimisation is proposed. Theoretically, the gradient methods are shown to be sound. Experimentally, the upper-bound method obtains the most reward, with the Bayesian Bellman error method a close second despite its computational simplicity.
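Example :
A minimal Python sketch of the Monte-Carlo upper bound described in the abstract, not the paper's published algorithm: the Dirichlet belief, known rewards, and all names here are illustrative assumptions. Sampling MDPs from the posterior, solving each exactly, and averaging the optimal values estimates E_beta[V*_mu(s)], which upper-bounds the Bayes-optimal value V*_beta(s) because the mean of the maxima dominates the maximum of the mean.

import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions, gamma = 5, 2, 0.95

# Assumed belief: independent Dirichlet posteriors over each transition
# row, with the reward function taken as known (illustrative only).
dirichlet_alpha = np.ones((n_states, n_actions, n_states))
rewards = rng.uniform(size=(n_states, n_actions))

def sample_mdp():
    """Draw one transition kernel P[s, a, s'] from the Dirichlet belief."""
    P = np.empty((n_states, n_actions, n_states))
    for s in range(n_states):
        for a in range(n_actions):
            P[s, a] = rng.dirichlet(dirichlet_alpha[s, a])
    return P

def optimal_value(P, tol=1e-8):
    """Value iteration for the optimal value function of a sampled MDP."""
    V = np.zeros(n_states)
    while True:
        Q = rewards + gamma * (P @ V)   # Q[s, a]; (S,A,S) @ (S,) -> (S,A)
        V_new = Q.max(axis=1)
        if np.abs(V_new - V).max() < tol:
            return V_new
        V = V_new

K = 64  # number of posterior samples
V_upper = np.mean([optimal_value(sample_mdp()) for _ in range(K)], axis=0)
print("Monte-Carlo upper bound on the Bayes-optimal value:", V_upper)

An optimistic policy in the spirit of the abstract would then act greedily with respect to such upper-bound estimates; how the bound is tightened and turned into a policy is specified in the paper itself.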
Keywords :
Monte Carlo methods; belief networks; gradient methods; learning (artificial intelligence); utility theory; Bayes-optimal value function; Bayesian Bellman error minimisation; Bayesian reinforcement learning; Monte-Carlo utility estimates; approximate bounds; computational simplicity; gradient-based algorithms; optimistic policy; upper bound estimation; computational modeling
Conference_Title :
2013 IEEE 52nd Annual Conference on Decision and Control (CDC)
Conference_Location :
Florence, Italy
Print_ISBN :
978-1-4673-5714-2
DOI :
10.1109/CDC.2013.6761048