Title :
Higher order Q-Learning
Author :
Edwards, Ashley ; Pottenger, William M.
Author_Institution :
Dept. of Comput. Sci., Univ. of Georgia, Athens, GA, USA
Abstract :
Higher order learning is a statistical relational learning framework in which relationships between different instances of the same class are leveraged (Ganiz, Lytkin and Pottenger, 2009); learning can be supervised or unsupervised. In contrast, reinforcement learning (Q-Learning) is a technique for learning in an unknown state space, where action selection is typically based on a greedy or epsilon-greedy approach. The drawback of this approach is that a large amount of initial exploration is often required before convergence. In this article we introduce a novel approach to this problem that treats a state space as a collection of data from which latent information can be extrapolated. From this data, we classify actions as leading to a high or low reward, and formulate behaviors based on this information. We provide experimental evidence that this technique drastically reduces the amount of exploration required in the initial stages of learning. We evaluate our algorithm in grid-world, a well-known reinforcement learning domain.
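For context, the sketch below shows the baseline the abstract contrasts against: tabular Q-Learning with epsilon-greedy action selection in a small grid world. The grid size, reward values, and hyperparameters (alpha, gamma, epsilon) are illustrative assumptions, not details from the paper, and the sketch deliberately omits the higher order classification step the authors propose.

    import random
    from collections import defaultdict

    ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
    SIZE = 5                                      # 5x5 grid (assumed)
    GOAL = (SIZE - 1, SIZE - 1)                   # single rewarding cell (assumed)

    def step(state, action):
        """Apply an action, clipping at the grid boundary."""
        r = min(max(state[0] + action[0], 0), SIZE - 1)
        c = min(max(state[1] + action[1], 0), SIZE - 1)
        next_state = (r, c)
        reward = 1.0 if next_state == GOAL else -0.01  # small step cost (assumed)
        return next_state, reward, next_state == GOAL

    def epsilon_greedy(Q, state, epsilon):
        """Explore with probability epsilon, otherwise exploit the Q-table."""
        if random.random() < epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: Q[(state, a)])

    def q_learning(episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1):
        Q = defaultdict(float)  # Q[(state, action)] -> estimated return
        for _ in range(episodes):
            state, done = (0, 0), False
            while not done:
                action = epsilon_greedy(Q, state, epsilon)
                next_state, reward, done = step(state, action)
                # Standard Q-Learning update toward the bootstrapped target;
                # the terminal state contributes no future value.
                best_next = 0.0 if done else max(Q[(next_state, a)] for a in ACTIONS)
                Q[(state, action)] += alpha * (reward + gamma * best_next
                                               - Q[(state, action)])
                state = next_state
        return Q

    if __name__ == "__main__":
        Q = q_learning()
        print(max(ACTIONS, key=lambda a: Q[((0, 0), a)]))  # greedy action at start

The long random-exploration phase of this baseline before the Q-table becomes informative is exactly the cost the paper's higher order approach aims to reduce.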
Keywords :
learning (artificial intelligence); epsilon-greedy approach; higher order Q-learning; reinforcement learning; state space; statistical relational learning framework; Bayesian methods; Classification algorithms; Learning; Learning systems; Machine learning; Support vector machines; Artificial intelligence; Higher Order Learning; Intelligent agent; Q-learning; Reinforcement learning; Statistical relational learning
Conference_Title :
2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Conference_Location :
Paris, France
Print_ISBN :
978-1-4244-9887-1
DOI :
10.1109/ADPRL.2011.5967385