DocumentCode :
2314459
Title :
On-line EM reinforcement learning
Author :
Yoshimoto, Junichiro ; Ishii, Shin ; Sato, Masaaki
Author_Institution :
Nara Inst. of Sci. & Technol., Ikoma, Japan
Volume :
3
fYear :
2000
fDate :
2000
Firstpage :
163
Abstract :
In this article, we propose a new reinforcement learning (RL) method for a system having continuous state and action spaces. Our RL method has an architecture like the actor-critic model. The critic tries to approximate the Q-function, which is the expected future return for the current state-action pair. The actor tries to approximate a stochastic soft-max policy defined by the Q-function. The soft-max policy is more likely to select an action that has a higher Q-function value. The online EM algorithm is used to train the critic and the actor. We apply this method to two control problems. Computer simulations show that our method is able to acquire fairly good control in the two tasks after a few learning trials.
Keywords :
learning (artificial intelligence); maximum likelihood estimation; neural nets; online operation; Q-function approximation; RL method; actor-critic model; continuous action space; continuous state space; control problems; expectation-maximisation algorithm; expected future return; online EM reinforcement learning; reinforcement learning method; state-action pair; stochastic soft-max policy; Automatic control; Computer simulation; Function approximation; Humans; Information processing; Learning; Probability distribution; Space technology; Stochastic processes; Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Neural Networks, 2000. IJCNN 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on
Conference_Location :
Como
ISSN :
1098-7576
Print_ISBN :
0-7695-0619-4
Type :
conf
DOI :
10.1109/IJCNN.2000.861298
Filename :
861298