Title :
Two Novel On-policy Reinforcement Learning Algorithms based on TD(λ)-methods
Author :
Wiering, Marco A.; Van Hasselt, Hado
Author_Institution :
Department of Information and Computing Sciences, Utrecht University
Abstract :
This paper describes two novel on-policy reinforcement learning algorithms, named QV(λ)-learning and the actor-critic learning automaton (ACLA). Both algorithms learn a state value function using TD(λ)-methods. The difference between the algorithms is that QV-learning uses the learned value function and a form of Q-learning to learn Q-values, whereas ACLA uses the value function and a learning-automaton-like update rule to update the actor. We describe several possible advantages of these methods compared to other value-function-based reinforcement learning algorithms such as Q-learning, Sarsa, and conventional actor-critic methods. Experiments are performed on (1) small, (2) large, (3) partially observable, and (4) dynamic maze problems with tabular and neural network value-function representations, and on the mountain car problem. The overall results show that the two novel algorithms can outperform previously known reinforcement learning algorithms.
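The abstract's description of the two update rules translates into a compact tabular form. Below is a minimal Python sketch of one QV-learning step and one ACLA step; it uses the TD(0) simplification for brevity (the paper uses TD(λ) eligibility traces), and the function signatures, the reduced single-action ACLA preference update, and all hyperparameter values (alpha, beta, gamma) are illustrative assumptions rather than details taken from the paper.

import numpy as np

def qv_update(V, Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    # One QV-learning step: V is learned by TD, while the Q-value of the
    # taken action bootstraps from the learned V(s') instead of
    # max_a Q(s', a) as in Q-learning.
    td_error = r + gamma * V[s_next] - V[s]               # TD(0) error on V
    V[s] += alpha * td_error                              # value-function update
    Q[s, a] += alpha * (r + gamma * V[s_next] - Q[s, a])  # Q bootstraps from V

def acla_update(V, prefs, s, a, r, s_next, alpha=0.1, beta=0.05, gamma=0.99):
    # One (simplified) ACLA step: the critic V is learned by TD as above,
    # while the actor's preference for the taken action is pushed toward 1
    # or 0 depending only on the sign of the TD error, a learning-
    # automaton-like rule. (Any renormalization over the non-taken
    # actions is omitted here.)
    td_error = r + gamma * V[s_next] - V[s]
    V[s] += alpha * td_error
    target = 1.0 if td_error >= 0.0 else 0.0              # automaton-like target
    prefs[s, a] += beta * (target - prefs[s, a])

# Illustrative usage with hypothetical problem sizes:
n_states, n_actions = 10, 4
V_qv, Q = np.zeros(n_states), np.zeros((n_states, n_actions))
V_ac = np.zeros(n_states)
prefs = np.full((n_states, n_actions), 1.0 / n_actions)
qv_update(V_qv, Q, s=0, a=1, r=1.0, s_next=2)
acla_update(V_ac, prefs, s=0, a=1, r=1.0, s_next=2)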
Keywords :
learning (artificial intelligence); learning automata; Q-learning; Q-value; QV-learning; actor-critic learning automaton; learning-automaton-like update rule; neural network value-function representation; on-policy reinforcement learning algorithm; state value function; value-function-based reinforcement learning; Dynamic programming; Intelligent systems; Learning automata; Neural networks; Optimal control; Probability distribution; State estimation; Stochastic systems
Conference_Title :
2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL 2007)
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0706-0
DOI :
10.1109/ADPRL.2007.368200