A Recurrent Control Neural Network for Data Efficient Reinforcement Learning

Author

Schaefer, Anton Maximilian ; Udluft, Steffen ; Zimmermann, Hans-Georg

Author_Institution

Dept. of Optimisation & Operations Res., Ulm Univ.

fYear

2007

fDate

1-5 April 2007

Firstpage

151

Lastpage

157

Abstract

In this paper we introduce a new model-based approach for data-efficient modelling and control of reinforcement learning problems in discrete time. Our architecture is based on a recurrent neural network (RNN) with dynamically consistent overshooting, which we extend by an additional control network. The latter has the specific task of learning the optimal policy. Because the approach uses a neural network, it can easily deal with high dimensions and is consequently able to break Bellman's curse of dimensionality. Furthermore, due to the high system-identification quality of RNNs, our method is highly data-efficient. Because of these properties we refer to our new model as a recurrent control neural network (RCNN). The network is tested on a standard reinforcement learning problem, namely cart-pole balancing, where it shows outstanding results, especially in terms of data efficiency.

Keywords

learning (artificial intelligence); recurrent neural nets; data efficient reinforcement learning; data-efficient modelling; discrete time systems; recurrent control neural network; Communication system control; Communications technology; Dynamic programming; Equations; Learning systems; Neural networks; Operations research; Recurrent neural networks; Telephony; Testing

fLanguage

English

Publisher

IEEE

Conference_Titel

Approximate Dynamic Programming and Reinforcement Learning, 2007. ADPRL 2007. IEEE International Symposium on

Conference_Location

Honolulu, HI

Print_ISBN

1-4244-0706-0

Type

conf

DOI

10.1109/ADPRL.2007.368182

Filename

4220827