DocumentCode :
288691
Title :
A reinforcement learning approach to on-line optimal control
Author :
An, P.E. ; Aslam-Mir, S. ; Brown, M. ; Harris, C.J.
Author_Institution :
Dept. of Aeronaut. & Astronaut., Southampton Univ., UK
Volume :
4
fYear :
1994
fDate :
27 Jun-2 Jul 1994
Firstpage :
2465
Abstract :
Presents a hybrid control architecture for solving on-line optimal control. In this architecture, the control law is dynamically scheduled between a reinforcement controller and a stabilizing controller so that the closed-loop performance is smoothly transformed from a reactive behavior to a predictive one. Based on a modified Q-learning technique, the reinforcement controller consists of two components: a policy function and a Q function. The policy function is incorporated explicitly so as to bypass the minimum operator normally required for selecting actions and updating the Q function. The architecture is then applied to a repetitive operation using a second-order linear time-variant plant with a nonlinear control structure. In this operation, the reinforcement signals are based on set-point errors, and the reinforcement controller is generalized using second-order B-spline networks. The example illustrates how, for a non-optimally tuned stabilizing controller, the closed-loop performance can be bootstrapped through reinforcement learning. Results show that the set-point performance of the hybrid controller improves over that of the fixed-structure controller by discovering better control strategies that compensate for the non-optimal gains and nonlinear control structure.
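The sketch below illustrates, under stated assumptions, the kind of scheme the abstract describes: a control law blended between a fixed stabilizing controller and a learned reinforcement controller, with a Q (cost-to-go) function updated using the action supplied by an explicit policy so that no minimum over actions is needed. The gains, learning rates, class name, and linear feature approximators are hypothetical placeholders; the paper itself uses second-order B-spline networks and a specific scheduling rule not reproduced here.

import numpy as np

class HybridController:
    """Blend of a fixed stabilizing law and a learned reinforcement law (illustrative sketch)."""

    def __init__(self, kp=1.0, kd=0.1, alpha=0.1, beta=0.05, gamma=0.95):
        self.kp, self.kd = kp, kd            # non-optimal stabilizing gains (assumed values)
        self.alpha, self.beta = alpha, beta  # critic / policy learning rates (assumed)
        self.gamma = gamma                   # discount factor
        self.w_q = np.zeros(3)               # weights of the Q (cost-to-go) approximator
        self.w_pi = np.zeros(2)              # weights of the explicit policy

    def _features(self, e, de, u):
        # Linear features over set-point error, its rate, and the action.
        return np.array([e, de, u])

    def policy(self, e, de):
        # Explicit policy output: avoids minimising Q over candidate actions.
        return self.w_pi @ np.array([e, de])

    def control(self, e, de, lam):
        # Control law scheduled between the two controllers by weight lam in [0, 1].
        u_stab = -self.kp * e - self.kd * de   # stabilizing controller
        u_rl = self.policy(e, de)              # reinforcement controller
        return (1.0 - lam) * u_stab + lam * u_rl

    def update(self, e, de, u, cost, e_next, de_next):
        # Modified Q-learning step: bootstrap with the policy's own next action.
        u_next = self.policy(e_next, de_next)
        q = self.w_q @ self._features(e, de, u)
        q_next = self.w_q @ self._features(e_next, de_next, u_next)
        td = cost + self.gamma * q_next - q
        self.w_q += self.alpha * td * self._features(e, de, u)
        # Adjust the policy in the direction that lowers the predicted cost.
        dq_du = self.w_q[2]
        self.w_pi -= self.beta * dq_du * np.array([e, de])

A typical loop would call control() at each sample, apply the action to the plant, form a cost from the set-point error, and then call update() with the observed transition; over repeated operations the learned component compensates for the non-optimal stabilizing gains.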
Keywords :
learning (artificial intelligence); linear systems; nonlinear control systems; optimal control; splines (mathematics); time-varying systems; closed-loop performance; hybrid control architecture; modified Q-learning technique; nonlinear control structure; nonoptimally tuned stabilizing controller; online optimal control; reactive behavior; reinforcement controller; reinforcement learning; repetitive operation; second-order B-splines networks; second-order linear-time-variant plant; set-point errors; stabilizing controller; Control systems; Costs; Dynamic scheduling; Error correction; Kinematics; Nonlinear control systems; Optimal control; Sampling methods; Spline; Supervised learning;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
1994 IEEE International Conference on Neural Networks (IEEE World Congress on Computational Intelligence)
Conference_Location :
Orlando, FL
Print_ISBN :
0-7803-1901-X
Type :
conf
DOI :
10.1109/ICNN.1994.374607
Filename :
374607