DocumentCode :
3269456
Title :
Optimistic planning for belief-augmented Markov Decision Processes
Author :
Fonteneau, Raphaël ; Busoniu, L. ; Munos, Rémi
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Univ. of Liège, Liège, Belgium
fYear :
2013
fDate :
16-19 April 2013
Firstpage :
77
Lastpage :
84
Abstract :
This paper presents the Bayesian Optimistic Planning (BOP) algorithm, a novel model-based Bayesian reinforcement learning approach. BOP extends the planning approach of the Optimistic Planning for Markov Decision Processes (OP-MDP) algorithm [10], [9] to settings where the transition model of the MDP is initially unknown and is progressively learned through interactions with the environment. Knowledge about the unknown MDP is represented as a probability distribution over all possible transition models using Dirichlet distributions, and BOP plans in the belief-augmented state space constructed by concatenating the original state vector with the current posterior distribution over transition models. We show that BOP becomes Bayes-optimal as the budget parameter grows to infinity. Preliminary empirical validation shows promising performance.
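To make the belief representation described in the abstract concrete, the Python sketch below maintains a Dirichlet posterior over an unknown discrete transition model and forms a belief-augmented state by pairing the MDP state with the current posterior. This is an illustrative sketch under simplifying assumptions (a small finite MDP, a uniform prior); the name DirichletBelief and its methods are hypothetical, not the paper's API, and the optimistic planner that BOP runs on top of this belief is not reproduced here.

import numpy as np

class DirichletBelief:
    # Dirichlet posterior over the unknown transition model of a finite MDP.
    # counts[s, a, s2] = prior pseudo-count plus the number of observed
    # transitions s --a--> s2, which fully parameterizes the posterior.
    def __init__(self, n_states, n_actions, prior=1.0):
        self.counts = np.full((n_states, n_actions, n_states), prior)

    def update(self, s, a, s_next):
        # Conjugate posterior update: observing one transition simply
        # increments the corresponding pseudo-count.
        self.counts[s, a, s_next] += 1.0

    def mean_transition(self, s, a):
        # Posterior mean estimate of P(. | s, a).
        return self.counts[s, a] / self.counts[s, a].sum()

belief = DirichletBelief(n_states=3, n_actions=2)
belief.update(s=0, a=1, s_next=2)      # learn from one interaction
augmented = (0, belief.counts.copy())  # belief-augmented state: (state, posterior)
print(belief.mean_transition(0, 1))    # [0.25 0.25 0.5] with the uniform prior

The conjugacy of the Dirichlet distribution with multinomial transitions is what keeps the belief update this cheap (a single count increment), which is why Dirichlet distributions are the natural choice for representing uncertainty over transition models in this setting.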
Keywords :
Markov processes; belief networks; learning (artificial intelligence); planning (artificial intelligence); probability; BOP algorithm; Bayesian optimistic planning algorithm; Dirichlet distributions; OP-MDP; belief-augmented Markov decision processes; novel model-based Bayesian reinforcement learning approach; optimistic planning for Markov decision processes; probability distribution; Algorithm design and analysis; Bayes methods; Context; Context modeling; Dynamic programming; Learning (artificial intelligence); Planning;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Conference_Location :
Singapore
ISSN :
2325-1824
Type :
conf
DOI :
10.1109/ADPRL.2013.6614992
Filename :
6614992