Title :
Higher order Q-Learning
Author :
Edwards, Ashley ; Pottenger, William M.
Author_Institution :
Dept. of Comput. Sci., Univ. of Georgia, Athens, GA, USA
Abstract :
Higher order learning is a statistical relational learning framework in which relationships between different instances of the same class are leveraged (Ganiz, Lytkin and Pottenger, 2009); learning can be supervised or unsupervised. In contrast, reinforcement learning (Q-Learning) is a technique for learning in an unknown state space, where action selection is typically based on a greedy or epsilon-greedy approach. The drawback of this approach is that a large amount of initial exploration is often required before convergence. In this article we introduce a novel approach to this problem that treats a state space as a collection of data from which latent information can be extrapolated. From this data, we classify actions as leading to a high or low reward, and formulate behaviors based on this information. We provide experimental evidence that this technique drastically reduces the amount of exploration required in the initial stages of learning. We evaluate our algorithm in grid-world, a well-known reinforcement learning domain.
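For context, the sketch below shows the baseline the abstract contrasts against: tabular Q-Learning with epsilon-greedy action selection in a small grid world. The grid size, reward values, and hyperparameters (alpha, gamma, epsilon) are illustrative assumptions, not details from the paper, and the sketch deliberately omits the higher order classification step the authors propose.

    import random
    from collections import defaultdict

    ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
    SIZE = 5                                      # 5x5 grid (assumed)
    GOAL = (SIZE - 1, SIZE - 1)                   # single rewarding cell (assumed)

    def step(state, action):
        """Apply an action, clipping at the grid boundary."""
        r = min(max(state[0] + action[0], 0), SIZE - 1)
        c = min(max(state[1] + action[1], 0), SIZE - 1)
        next_state = (r, c)
        reward = 1.0 if next_state == GOAL else -0.01  # small step cost (assumed)
        return next_state, reward, next_state == GOAL

    def epsilon_greedy(Q, state, epsilon):
        """Explore with probability epsilon, otherwise exploit the Q-table."""
        if random.random() < epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: Q[(state, a)])

    def q_learning(episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1):
        Q = defaultdict(float)  # Q[(state, action)] -> estimated return
        for _ in range(episodes):
            state, done = (0, 0), False
            while not done:
                action = epsilon_greedy(Q, state, epsilon)
                next_state, reward, done = step(state, action)
                # Standard Q-Learning update toward the bootstrapped target;
                # the terminal state contributes no future value.
                best_next = 0.0 if done else max(Q[(next_state, a)] for a in ACTIONS)
                Q[(state, action)] += alpha * (reward + gamma * best_next
                                               - Q[(state, action)])
                state = next_state
        return Q

    if __name__ == "__main__":
        Q = q_learning()
        print(max(ACTIONS, key=lambda a: Q[((0, 0), a)]))  # greedy action at start

The long random-exploration phase of this baseline before the Q-table becomes informative is exactly the cost the paper's higher order approach aims to reduce.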
Keywords :
learning (artificial intelligence); epsilon-greedy approach; higher order Q-learning; reinforcement learning; state space; statistical relational learning framework; Bayesian methods; Classification algorithms; Learning; Learning systems; Machine learning; Support vector machines; Artificial intelligence; Higher Order Learning; Intelligent agent; Q-learning; Reinforcement learning; Statistical relational learning
Conference_Title :
2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Conference_Location :
Paris, France
Print_ISBN :
978-1-4244-9887-1
DOI :
10.1109/ADPRL.2011.5967385