Title :
To explore continuous action space in actor/critic architecture
Author :
Yu, Wenwei ; Iijima, Daisuke ; Yokoi, Hiroshi ; Kakazu, Yukinori
Author_Institution :
Dept. of Complex Syst. Eng., Hokkaido Univ., Sapporo, Japan
Abstract :
Reinforcement learning (RL) is a commonly used paradigm for learning in autonomous dynamical systems. A popular topic in this field is how to extend RL to continuous state and action spaces, so that it can be applied to more real-world problems. ASE/ACE, one of the best-known implementations of RL, is one possible solution. However, the scheme has been shown to converge more slowly than methods based on discrete state and action spaces, such as Q-learning. The reason is clear: the continuous state and action spaces need to be organized to reduce the indefinite search to a definite one. Moreover, few RL systems explore the action space by combining effective action sequences to capture the regularity of the environment and thus make those sequences reusable. We add a memory-based sequence structure, and correspondingly an adaptive action sequence critic, to the actor/critic architecture to organize the action space. By generating and organizing action sequences in the continuous action space, the new model is able to improve learning speed and acquire environment-oriented skills. Experiments on a benchmark double integrator problem and a complicated 2-dimensional problem are carried out to show the effectiveness of the new model.
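For readers unfamiliar with the setting, the following is a minimal sketch of a plain actor/critic learner with Gaussian exploration on a double integrator. It is not the authors' model: it omits the memory-based sequence structure and the action sequence critic, and it uses a generic TD-error update rather than ASE/ACE. The names `step`, `dot`, and `run_episode`, the linear features, the quadratic cost, and all step sizes are assumptions made for illustration only.

```python
import random


def step(pos, vel, action, dt=0.05, u_max=1.0):
    """One Euler step of a double integrator (acceleration = clipped action)."""
    u = max(-u_max, min(u_max, action))
    new_pos = pos + dt * vel
    new_vel = vel + dt * u
    # Hypothetical quadratic cost; the abstract does not state the reward.
    reward = -(new_pos * new_pos + new_vel * new_vel) * dt
    return new_pos, new_vel, reward


def dot(w, phi):
    """Inner product of a weight list and a feature list."""
    return sum(wi * pi for wi, pi in zip(w, phi))


def run_episode(actor_w, critic_w, rng, steps=200, sigma=0.3,
                gamma=0.95, alpha_actor=0.001, alpha_critic=0.01):
    """Run one episode, updating both weight lists in place; return total reward."""
    pos, vel = 1.0, 0.0
    total = 0.0
    for _ in range(steps):
        phi = [pos, vel, 1.0]                # assumed linear features
        mean = dot(actor_w, phi)             # actor: mean of a Gaussian policy
        action = rng.gauss(mean, sigma)      # explore in continuous action space
        pos, vel, reward = step(pos, vel, action)
        phi_next = [pos, vel, 1.0]
        # Critic: one-step temporal-difference error.
        td_error = reward + gamma * dot(critic_w, phi_next) - dot(critic_w, phi)
        for i in range(3):
            critic_w[i] += alpha_critic * td_error * phi[i]
            # Actor: shift the mean toward actions that did better than expected.
            actor_w[i] += alpha_actor * td_error * (action - mean) * phi[i]
        total += reward
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    actor_w, critic_w = [0.0] * 3, [0.0] * 3
    for episode in range(50):
        total = run_episode(actor_w, critic_w, rng)
    print("total reward in last episode:", total)
```

Each step here samples a single action independently, which is the kind of unorganized search over the continuous action space that the abstract proposes to improve by generating and reusing action sequences.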
Keywords :
convergence; learning (artificial intelligence); pattern clustering; sequences; ASE/ACE; action space; actor/critic architecture; adaptive action sequence critic; autonomous dynamical systems; bench-mark double integrator problem; continuous action space; environment-oriented skill; memory-based sequence structure; reinforcement learning; Disk recording; Learning; Memory architecture; Neural networks; Organizing; Space exploration; State-space methods; Systems engineering and theory;
Conference_Title :
Systems, Man, and Cybernetics, 1999. IEEE SMC '99 Conference Proceedings. 1999 IEEE International Conference on
Conference_Location :
Tokyo
Print_ISBN :
0-7803-5731-0
DOI :
10.1109/ICSMC.1999.815591