Adaptive bases for Q-learning

Author

Di Castro, Dotan ; Mannor, Shie

Author_Institution

Fac. of Electr. Eng., Technion - Israel Inst. of Technol., Haifa, Israel

fYear

2010

fDate

15-17 Dec. 2010

Firstpage

4587

Lastpage

4593

Abstract

We consider reinforcement learning, and in particular the Q-learning algorithm, in large state and action spaces. To cope with the size of these spaces, a function approximation approach to the state-action value function is needed. We generalize the classical Q-learning algorithm to an algorithm in which the basis of the linear function approximation changes dynamically while interacting with the environment. The motivation for such an approach is to improve the fit of the state-action value function to the problem faced, thus obtaining better performance. The algorithm is shown to converge using two-time-scale stochastic approximation. Finally, we discuss how this technique can be applied to a rich family of RL algorithms with linear function approximation.
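
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the chain environment, the Gaussian radial basis functions with adaptable centers, the semi-gradient update of the centers, the uniformly random behaviour policy, and the step-size exponents are all assumptions chosen for illustration. The only property taken from the abstract is the two-time-scale structure: the linear weights are updated with a faster step size than the basis parameters.

```python
import math
import random

WIDTH = 1.0  # fixed width of each Gaussian basis function (assumption)

def features(s, centers):
    """Gaussian RBF features; the centers are the adaptable basis parameters."""
    return [math.exp(-((s - c) ** 2) / (2.0 * WIDTH ** 2)) for c in centers]

def q_value(s, a, weights, centers):
    """Linear approximation Q(s, a) = sum_i weights[a][i] * phi_i(s)."""
    return sum(w * p for w, p in zip(weights[a], features(s, centers)))

def env_step(s, a, n_states):
    """Hypothetical chain: action 1 moves right, action 0 moves left.
    Reaching the right end gives reward 1 and ends the episode."""
    s_next = min(n_states - 1, s + 1) if a == 1 else max(0, s - 1)
    done = s_next == n_states - 1
    return s_next, (1.0 if done else 0.0), done

def step_sizes(t):
    """Fast step alpha for the weights, slow step beta for the basis.
    beta / alpha = t ** -0.3 tends to zero, giving two time scales."""
    return 0.3 / t ** 0.6, 0.3 / t ** 0.9

def adaptive_basis_q_learning(n_states=6, n_basis=3, n_steps=5000,
                              gamma=0.9, seed=0):
    rng = random.Random(seed)
    # Start with centers spread evenly over the state range.
    centers = [i * (n_states - 1) / (n_basis - 1) for i in range(n_basis)]
    weights = [[0.0] * n_basis for _ in range(2)]  # one weight vector per action
    s = 0
    for t in range(1, n_steps + 1):
        alpha, beta = step_sizes(t)
        a = rng.randrange(2)  # uniformly random behaviour (Q-learning is off-policy)
        s_next, r, done = env_step(s, a, n_states)
        # Q-learning target: no bootstrapping from the terminal state.
        best_next = 0.0 if done else max(
            q_value(s_next, b, weights, centers) for b in (0, 1))
        delta = r + gamma * best_next - q_value(s, a, weights, centers)
        phi = features(s, centers)
        # dQ/dc_i = w_i * phi_i * (s - c_i) / WIDTH^2, using the current weights.
        grad_c = [weights[a][i] * phi[i] * (s - centers[i]) / WIDTH ** 2
                  for i in range(n_basis)]
        for i in range(n_basis):
            weights[a][i] += alpha * delta * phi[i]   # fast time scale
            centers[i] += beta * delta * grad_c[i]    # slow time scale
        s = 0 if done else s_next  # restart the episode at the left end
    return weights, centers

if __name__ == "__main__":
    w, c = adaptive_basis_q_learning()
    print("weights:", w)
    print("centers:", c)
```

In this sketch the weights see an almost stationary basis because the centers move on the slower time scale, which is the usual intuition behind two-time-scale arguments. The sketch carries no convergence guarantee of its own.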

Keywords

function approximation; learning (artificial intelligence); state-space methods; stochastic processes; Q-learning algorithm; RL algorithm; action space; linear function approximation; reinforcement learning; state space; state-action value function fitness; stochastic approximation; Approximation algorithms; Convergence; Equations; Function approximation; Linear approximation; Stochastic processes

fLanguage

English

Publisher

IEEE

Conference_Titel

Decision and Control (CDC), 2010 49th IEEE Conference on

Conference_Location

Atlanta, GA

ISSN

0743-1546

Print_ISBN

978-1-4244-7745-6

Type

conf

DOI

10.1109/CDC.2010.5717385

Filename

5717385