A model based approach to exploration of continuous-state MDPs using Divergence-to-Go

Author

Matthew Emigh; Evan Kriminger; José C. Principe

Author_Institution

University of Florida, Department of Electrical and Computer Engineering, Gainesville, Florida 32611

fYear

2015

Firstpage

Lastpage

Abstract

In reinforcement learning, exploration is typically conducted by taking occasional random actions. The literature lacks an exploration method driven by uncertainty, in which exploratory actions explicitly seek to improve the learning process in a sequential decision problem. In this paper, we propose a framework called Divergence-to-Go, a model-based method that uses a recursion similar to that of dynamic programming to quantify the uncertainty associated with each state-action pair. Information-theoretic estimators of uncertainty allow our method to function even in large, continuous spaces. Performance is demonstrated on a maze task and a mountain car task.

Keywords

"Uncertainty","Kernel","Computational modeling","Measurement uncertainty","Markov processes","Learning (artificial intelligence)","Monte Carlo methods"

Publisher

IEEE

Conference_Title

Machine Learning for Signal Processing (MLSP), 2015 IEEE 25th International Workshop on

Type

conf

DOI

10.1109/MLSP.2015.7324371

Filename

7324371

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=3688650