DocumentCode :
2498705
Title :
Higher order Q-Learning
Author :
Edwards, Ashley ; Pottenger, William M.
Author_Institution :
Dept. of Comput. Sci., Univ. of Georgia, Athens, GA, USA
fYear :
2011
fDate :
11-15 April 2011
Firstpage :
128
Lastpage :
134
Abstract :
Higher order learning is a statistical relational learning framework that leverages relationships between different instances of the same class (Ganiz, Lytkin and Pottenger, 2009). Learning can be supervised or unsupervised. In contrast, reinforcement learning (here, Q-learning) is a technique for learning in an unknown state space. Action selection is often based on a greedy or epsilon-greedy approach. The drawback of this approach is that it often requires a large amount of initial exploration before convergence. In this article we introduce a novel approach to this problem that treats a state space as a collection of data from which latent information can be extrapolated. From this data, we classify actions as leading to a high reward or a low reward, and formulate behaviors based on this information. We provide experimental evidence that this technique drastically reduces the amount of exploration required in the initial stages of learning. We evaluate our algorithm in a well-known reinforcement learning domain, grid-world.
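For orientation, the baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration of standard tabular Q-learning with epsilon-greedy action selection, not the higher order method of the paper; the function names, the dictionary-of-lists Q-table, and the default values of alpha and gamma are assumptions chosen for the example.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index: random with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    best = max(q_values)
    # Break ties at random so all-equal initial estimates do not favor one action.
    return rng.choice([a for a, q in enumerate(q_values) if q == best])


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard Q-learning update in place (assumed hyperparameters)."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
```

With every estimate initialized to the same value, the greedy branch carries no information, so early behavior is effectively random. That is the initial exploration cost the abstract refers to.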
Keywords :
learning (artificial intelligence); epsilon greedy approach; higher order Q-learning; reinforcement learning; state space; statistical relational learning framework; Bayesian methods; Classification algorithms; Learning; Learning systems; Machine learning; Support vector machines; Artificial intelligence; Bayesian methods; Higher Order Learning; Intelligent agent; Machine learning; Q-learning; Reinforcement learning; Statistical relational learning;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Adaptive Dynamic Programming And Reinforcement Learning (ADPRL), 2011 IEEE Symposium on
Conference_Location :
Paris
Print_ISBN :
978-1-4244-9887-1
Type :
conf
DOI :
10.1109/ADPRL.2011.5967385
Filename :
5967385