DocumentCode :
3538685
Title :
Monte-Carlo utility estimates for Bayesian reinforcement learning
Author :
Dimitrakakis, Christos
Author_Institution :
Chalmers Univ. of Technol., Gothenburg, Sweden
fYear :
2013
fDate :
10-13 Dec. 2013
Firstpage :
7303
Lastpage :
7308
Abstract :
This paper discusses algorithms for Monte-Carlo Bayesian reinforcement learning. First, Monte-Carlo estimates of upper bounds on the Bayes-optimal value function are used to construct an optimistic policy. Second, gradient-based algorithms for computing approximate bounds are introduced. Finally, a new class of gradient algorithms for Bayesian Bellman error minimisation is proposed. The gradient methods are shown to be theoretically sound. Experimentally, the upper-bound method obtains the most reward, with the Bayesian Bellman error method a close second despite its computational simplicity.
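To illustrate the first idea, here is a minimal sketch, not the paper's implementation: assuming a discrete MDP with a Dirichlet posterior over transition probabilities and known rewards (all sizes and names below are illustrative assumptions), the average of the optimal values of MDPs sampled from the posterior upper-bounds the Bayes-optimal value, since E[max_pi V_mu^pi] >= max_pi E[V_mu^pi] by convexity of the maximum.

```python
# Sketch of a Monte-Carlo upper bound on the Bayes-optimal value:
# sample MDPs from the posterior, solve each one by value iteration,
# and average their optimal value functions. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, gamma = 5, 2, 0.95

# Hypothetical posterior parameters: Dirichlet counts for each
# (state, action) pair, e.g. a unit prior plus observed transitions.
counts = rng.integers(1, 10, size=(n_states, n_actions, n_states)).astype(float)
rewards = rng.random((n_states, n_actions))  # rewards assumed known

def value_iteration(P, R, gamma, tol=1e-8):
    """Optimal value function of one sampled MDP (P: S x A x S, R: S x A)."""
    V = np.zeros(P.shape[0])
    while True:
        Q = R + gamma * np.einsum('sat,t->sa', P, V)
        V_new = Q.max(axis=1)
        if np.abs(V_new - V).max() < tol:
            return V_new
        V = V_new

def mc_upper_bound(counts, rewards, gamma, n_samples=32):
    """Monte-Carlo estimate of an upper bound on the Bayes-optimal value:
    the average optimal value over MDPs drawn from the posterior."""
    total = np.zeros(counts.shape[0])
    for _ in range(n_samples):
        # One transition kernel sampled from the Dirichlet posterior.
        P = np.apply_along_axis(rng.dirichlet, -1, counts)
        total += value_iteration(P, rewards, gamma)
    return total / n_samples

print(mc_upper_bound(counts, rewards, gamma))
```

An optimistic policy could then act greedily with respect to such upper-bound estimates; the paper's actual constructions, including the gradient-based approximations, differ in detail.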
Keywords :
Monte Carlo methods; belief networks; gradient methods; learning (artificial intelligence); utility theory; Bayes-optimal value function; Bayesian Bellman error minimisation; Bayesian reinforcement learning; Monte-Carlo utility estimates; approximate bounds; computational simplicity; gradient-based algorithms; optimistic policy; upper bound estimation; computational modeling
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2013 IEEE 52nd Annual Conference on Decision and Control (CDC)
Conference_Location :
Florence, Italy
ISSN :
0743-1546
Print_ISBN :
978-1-4673-5714-2
Type :
conf
DOI :
10.1109/CDC.2013.6761048
Filename :
6761048