Regional Information Center for Science and Technology

DocumentCode :

2314459

Title :

On-line EM reinforcement learning

Author :

Yoshimoto, Junichiro ; Ishii, Shin ; Sato, Masaaki

Author_Institution :

Nara Inst. of Sci. & Technol., Ikoma, Japan

Volume :

fYear :

2000

fDate :

2000

Firstpage :

163

Abstract :

In this article, we propose a new reinforcement learning (RL) method for a system having continuous state and action spaces. Our RL method has an architecture like the actor-critic model. The critic tries to approximate the Q-function, which is the expected future return for the current state-action pair. The actor tries to approximate a stochastic soft-max policy defined by the Q-function. The soft-max policy is more likely to select an action that has a higher Q-function value. The online EM algorithm is used to train the critic and the actor. We apply this method to two control problems. Computer simulations show that our method is able to acquire fairly good control in the two tasks after a few learning trials.
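
The soft-max policy mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper works with continuous action spaces and trains an actor to approximate the policy, whereas this sketch assumes a small finite set of candidate actions with already-known Q-values. The names `softmax_policy`, `sample_action`, and the `temperature` parameter are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax_policy(q_values, temperature=1.0):
    """Return action probabilities proportional to exp(Q / temperature)."""
    # Subtract the maximum Q-value before exponentiating for numerical stability;
    # this does not change the resulting probabilities.
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def sample_action(candidate_actions, q_values, temperature=1.0, rng=random):
    """Draw one action at random, favouring actions with higher Q-values."""
    probs = softmax_policy(q_values, temperature)
    return rng.choices(candidate_actions, weights=probs, k=1)[0]


# Hypothetical Q-values for three candidate actions in a single state.
candidate_actions = [-1.0, 0.0, 1.0]
q_values = [0.2, 1.0, 0.5]

probs = softmax_policy(q_values)
action = sample_action(candidate_actions, q_values)
```

In this sketch the action with the highest Q-value receives the largest probability, but every action keeps a nonzero chance of being selected, which is what makes the policy stochastic rather than greedy.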

Keywords :

learning (artificial intelligence); maximum likelihood estimation; neural nets; online operation; Q-function approximation; RL method; actor-critic model; continuous action space; continuous state space; control problems; expectation-maximisation algorithm; expected future return; online EM reinforcement learning; reinforcement learning method; state-action pair; stochastic soft-max policy; Automatic control; Computer simulation; Function approximation; Humans; Information processing; Learning; Probability distribution; Space technology; Stochastic processes; Vectors;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Neural Networks, 2000. IJCNN 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on

Conference_Location :

Como, Italy

ISSN :

1098-7576

Print_ISBN :

0-7695-0619-4

Type :

conf

DOI :

10.1109/IJCNN.2000.861298

Filename :

861298

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2314459