Title :
Learning to control a joint driven double inverted pendulum using nested actor/critic algorithm
Author :
Kobori, Norimasa ; Suzuki, Kenji ; Hartono, Pitoyo ; Hashimoto, Shuji
Author_Institution :
Dept. of Appl. Phys., Waseda Univ., Tokyo, Japan
Abstract :
In recent years, reinforcement learning, which can acquire reflective and adaptive actions, has become the center of attention as a learning method for robotics control. However, many unsolved problems must be resolved before the method can be put into practical use. One of them is the handling of the state space and the action space. Many existing reinforcement learning algorithms deal with discrete state and action spaces. When the discretization of the search space is coarse, fine control cannot be achieved (imperfect perception). Conversely, when the discretization is too fine, the search space grows accordingly and learning does not converge stably (curse of dimensionality). In this paper, we propose a nested actor/critic algorithm that can deal with continuous state and action spaces. The proposed method inserts a child actor/critic into the actor part of a parent actor/critic algorithm. We evaluated the proposed algorithm on a stabilization control problem, both in simulation and on a prototype model of a joint-driven double inverted pendulum.
Keywords :
learning (artificial intelligence); neurocontrollers; pendulums; radial basis function networks; robust control; simulation; state-space methods; Markov decision processes; RBF network function approximators; action space; child actor-critic; continuous control values; imperfect perception; joint driven double inverted pendulum; machine learning; nested actor-critic algorithm; parent actor-critic algorithm; reinforcement learning; simulation; stabilization control; stable control problem; state space; Control systems; Convergence; Grid computing; Learning systems; Machine learning; Machine learning algorithms; Orbital robotics; Physics; State-space methods; System performance;
Conference_Title :
Neural Information Processing, 2002. ICONIP '02. Proceedings of the 9th International Conference on
Print_ISBN :
981-04-7524-1
DOI :
10.1109/ICONIP.2002.1201968