Regional Information Center for Science and Technology - Neural reinforcement learning to swing-up and balance a real pole

DocumentCode :

2955893

Title :

Neural reinforcement learning to swing-up and balance a real pole

Author :

Riedmiller, Martin

Author_Institution :

Neuroinformatics Group, Osnabrueck Univ., Germany

Volume :

fYear :

2005

fDate :

10-12 Oct. 2005

Firstpage :

3191

Abstract :

This paper proposes a neural-network-based reinforcement learning controller that learns control policies in a highly data-efficient manner. This allows reinforcement learning to be applied directly to real plants: neither a transition model nor a simulation model of the plant is needed for training. The only training information provided to the controller is the set of transition experiences collected from interactions with the real plant. Because these transition experiences are stored explicitly, they can be reconsidered for updating the neural Q-function in every training step. This results in a stable learning process for the neural Q-value function. The algorithm is applied to the highly nonlinear and noisy task of swinging up and balancing a real inverted pendulum. The amount of real-time interaction needed to learn a highly effective policy from scratch was less than 14 minutes.
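
The idea of storing transitions and reusing all of them in every Q-function update can be illustrated with a minimal sketch. This is not the paper's implementation: the neural network is replaced by a plain lookup table so the example runs with the standard library alone, the "plant" is a hypothetical 5-state chain rather than a pendulum, and the cost-minimizing formulation, the discount factor, and all function names are assumptions made for illustration.

```python
import random


def collect_transitions(n_steps, n_states=5, seed=0):
    """Interact with a hypothetical 1-D chain 'plant' using random actions.

    Returns a list of stored transitions (s, a, cost, s_next, done).
    The goal is the right-most state; every other step costs 1.
    """
    rng = random.Random(seed)
    transitions = []
    s = 0
    for _ in range(n_steps):
        a = rng.choice((-1, 1))
        s_next = min(max(s + a, 0), n_states - 1)
        done = s_next == n_states - 1
        cost = 0.0 if done else 1.0
        transitions.append((s, a, cost, s_next, done))
        s = 0 if done else s_next  # restart the episode at the left end
    return transitions


def fitted_q_iteration(transitions, actions=(-1, 1), gamma=0.95, n_iters=50):
    """Batch Q-learning over the stored transitions.

    Each iteration recomputes a target for EVERY stored transition from the
    current Q estimate, then refits Q to those targets. Here the 'fit' is a
    per-(state, action) average, standing in for neural network training.
    """
    q = {}  # (state, action) -> estimated cost-to-go; unseen pairs read as 0
    for _ in range(n_iters):
        sums, counts = {}, {}
        for s, a, cost, s_next, done in transitions:
            future = 0.0 if done else min(q.get((s_next, b), 0.0) for b in actions)
            target = cost + gamma * future
            sums[(s, a)] = sums.get((s, a), 0.0) + target
            counts[(s, a)] = counts.get((s, a), 0) + 1
        q = {key: sums[key] / counts[key] for key in sums}
    return q


def greedy_action(q, s, actions=(-1, 1)):
    """Pick the action with the lowest estimated cost-to-go in state s."""
    return min(actions, key=lambda a: q.get((s, a), float("inf")))


if __name__ == "__main__":
    data = collect_transitions(2000)
    q = fitted_q_iteration(data)
    print([greedy_action(q, s) for s in range(4)])
```

No model of the chain is used during learning: the update loop only ever reads the stored tuples, which is the property the abstract emphasizes. Swapping the averaging step for a regression model trained on the same (state, action, target) pairs would move the sketch closer to a neural Q-function.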

Keywords :

learning (artificial intelligence); neural nets; nonlinear control systems; control policy; inverted pendulum; learning process; neural Q-value function; neural network; neural reinforcement learning; reinforcement learning controller; Acceleration; Algorithm design and analysis; Learning; Multi-layer neural network; Multilayer perceptrons; Neural networks; Regression tree analysis; State-space methods; Stochastic systems; Stress;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Systems, Man and Cybernetics, 2005 IEEE International Conference on

Print_ISBN :

0-7803-9298-1

Type :

conf

DOI :

10.1109/ICSMC.2005.1571637

Filename :

1571637

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2955893