  • DocumentCode
    3538685
  • Title
    Monte-Carlo utility estimates for Bayesian reinforcement learning
  • Author
    Dimitrakakis, Christos
  • Author_Institution
    Chalmers University of Technology, Gothenburg, Sweden
  • fYear
    2013
  • fDate
    10-13 Dec. 2013
  • Firstpage
    7303
  • Lastpage
    7308
  • Abstract
    This paper discusses algorithms for Monte-Carlo Bayesian reinforcement learning. First, Monte-Carlo estimates of upper bounds on the Bayes-optimal value function are used to construct an optimistic policy. Second, gradient-based algorithms for approximate bounds are introduced. Finally, a new class of gradient algorithms for Bayesian Bellman error minimisation is proposed. Theoretically, the gradient methods are shown to be sound. Experimentally, the upper-bound method obtains the most reward, while the Bayesian Bellman error method is a close second despite its computational simplicity.
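    A minimal sketch of the first idea above (an illustrative Monte-Carlo upper bound on the Bayes-optimal value function, not the paper's exact algorithm; sample_mdp, the array shapes, and all hyperparameters are assumptions introduced here):

        # Sample MDPs from the posterior, solve each by value iteration,
        # and average the optimal values. Since E[max_pi V] >= max_pi E[V],
        # this average upper-bounds the Bayes-optimal value (Jensen's inequality).
        import numpy as np

        def value_iteration(P, R, gamma=0.95, iters=200):
            """Solve one sampled MDP. P: (S, A, S) transitions, R: (S, A) rewards."""
            S, A, _ = P.shape
            V = np.zeros(S)
            for _ in range(iters):
                Q = R + gamma * P @ V   # Q[s, a] = R[s, a] + gamma * E_{s'}[V(s')]
                V = Q.max(axis=1)
            return V

        def mc_upper_bound(sample_mdp, n_samples=32, gamma=0.95):
            """sample_mdp() draws (P, R) from the current posterior (assumed given)."""
            draws = [value_iteration(*sample_mdp(), gamma) for _ in range(n_samples)]
            return np.mean(draws, axis=0)   # per-state Monte-Carlo upper bound

    An optimistic policy can then act greedily with respect to this averaged upper bound.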
  • Keywords
    Monte Carlo methods; belief networks; gradient methods; learning (artificial intelligence); utility theory; Bayes-optimal value function; Bayesian Bellman error minimisation; Bayesian reinforcement learning; Monte-Carlo utility estimates; approximate bounds; computational simplicity; gradient-based algorithms; optimistic policy; upper bound estimation; Computational modeling
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    2013 IEEE 52nd Annual Conference on Decision and Control (CDC)
  • Conference_Location
    Florence, Italy
  • ISSN
    0743-1546
  • Print_ISBN
    978-1-4673-5714-2
  • Type
    conf
  • DOI
    10.1109/CDC.2013.6761048
  • Filename
    6761048