DocumentCode :
2432324
Title :
Adaptive linear quadratic control using policy iteration
Author :
Bradtke, Steven J. ; Ydstie, B. Erik ; Barto, Andrew G.
Author_Institution :
Dept. of Comput. & Inf. Sci., Massachusetts Univ., Amherst, MA, USA
Volume :
3
fYear :
1994
fDate :
29 June-1 July 1994
Firstpage :
3475
Abstract :
In this paper we present stability and convergence results for dynamic programming-based reinforcement learning applied to the linear quadratic regulation (LQR) problem. The specific algorithm we analyze is based on Q-learning and is proven to converge to the optimal controller provided that the underlying system is controllable and a particular signal vector is persistently exciting. This is the first convergence result for DP-based reinforcement learning algorithms applied to a continuous problem.
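A minimal sketch, in Python/NumPy, of the kind of Q-learning-based policy iteration for LQR that the abstract describes. This is an illustration under assumptions, not the authors' implementation (the paper uses a recursive least-squares estimator): the plant matrices A and B, the costs Q and R, the exploration-noise level, and the sample counts below are placeholders chosen for the example, and the initial policy is assumed to be stabilizing.

import numpy as np

np.random.seed(0)
# Illustrative plant and costs (assumptions, not from the paper); the learner
# only sees sampled transitions and costs, never A or B directly.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [0.1]])
Q, R = np.eye(2), np.eye(1)
n, m = 2, 1
p = n + m                                # dimension of z = [x; u]

K = np.zeros((m, n))                     # initial policy u = -K x (assumed stabilizing)
for it in range(10):                     # policy-iteration loop
    # Policy evaluation: fit Q_K(x, u) = z^T H z by least squares on the
    # Bellman equation, with probing noise on u for persistent excitation.
    Phi, y = [], []
    x = np.random.randn(n)
    for t in range(200):
        u = -K @ x + 0.3 * np.random.randn(m)
        cost = x @ Q @ x + u @ R @ u
        x_next = A @ x + B @ u
        u_next = -K @ x_next             # greedy action under the current policy
        z = np.concatenate([x, u])
        z_next = np.concatenate([x_next, u_next])
        Phi.append(np.outer(z, z).ravel() - np.outer(z_next, z_next).ravel())
        y.append(cost)
        x = np.random.randn(n) if (t + 1) % 25 == 0 else x_next  # occasional reset
    h, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
    H = h.reshape(p, p)
    H = 0.5 * (H + H.T)                  # symmetrize the estimated Q-function matrix
    # Policy improvement: u = argmin_u Q_K(x, u)  =>  K <- H_uu^{-1} H_ux
    K = np.linalg.solve(H[n:, n:], H[n:, :n])

print("learned feedback gain K:\n", K)

The learned gain can be compared against the solution of the discrete-time algebraic Riccati equation for the same (A, B, Q, R) to check that the iteration is converging toward the LQR-optimal controller.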
Keywords :
adaptive control; discrete time systems; dynamic programming; intelligent control; iterative methods; learning (artificial intelligence); linear quadratic control; multivariable systems; stability; Q-learning; adaptive linear quadratic control; convergence; discrete time systems; dynamic programming-based reinforcement learning; multivariable system; optimal controller; policy iteration; signal vector; stability; Adaptive control; Computer science; Control systems; Cost function; Feedback control; Learning; Optimal control; Programmable control; Symmetric matrices; Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
American Control Conference, 1994
Print_ISBN :
0-7803-1783-1
Type :
conf
DOI :
10.1109/ACC.1994.735224
Filename :
735224