DocumentCode
2153597
Title
Model-free Q-learning designs for discrete-time zero-sum games with application to H-infinity control
Author
Al-Tamimi, Asma ; Lewis, Frank L. ; Abu-Khalaf, Murad
Author_Institution
Autom. & Robot. Res. Inst., Univ. of Texas at Arlington, Fort Worth, TX, USA
fYear
2007
fDate
2-5 July 2007
Firstpage
1668
Lastpage
1675
Abstract
In this paper, the optimal strategies for discrete-time linear-quadratic zero-sum games related to the H-infinity optimal control problem are computed forward in time without knowledge of the system dynamics matrices. The idea is to solve for an action-dependent value function Q(x,u,w) of the zero-sum game instead of the state-dependent value function V(x), which satisfies a corresponding game algebraic Riccati equation (GARE). Since the state and action spaces are continuous, two action networks and one critic network are used, and they are tuned adaptively forward in time using adaptive critic methods. The result is a model-free Q-learning approximate dynamic programming approach that solves the zero-sum game forward in time. It is shown that the critic converges to the game value function and that the action networks converge to the Nash equilibrium of the game. Convergence proofs for the algorithm are given. It is proven that the algorithm reduces to a model-free iterative algorithm for solving the GARE of the linear-quadratic discrete-time zero-sum game. The effectiveness of the method is demonstrated with an H-infinity autopilot design for an F-16 aircraft.
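As an illustration of why an action-dependent value function avoids the need for the system matrices, the sketch below assumes the Q-function is quadratic, Q(x,u,w) = z^T H z with z = [x; u; w], and recovers linear control and disturbance gains from the kernel H alone by setting the gradients with respect to u and w to zero. The function name `saddle_point_gains`, the restriction to a scalar control and a scalar disturbance, and the numerical kernel in the usage example are assumptions made for this sketch; they are not taken from the paper.

```python
def saddle_point_gains(H, n):
    """Gains read off a quadratic Q kernel (scalar u, scalar w: an assumption).

    Q(x, u, w) = z^T H z with z = [x_1, ..., x_n, u, w] and H symmetric,
    of size (n + 2) x (n + 2). Setting dQ/du = dQ/dw = 0 gives

        [Huu Huw] [u]     [Hux]
        [Hwu Hww] [w] = - [Hwx] x,

    which is solved here with Cramer's rule. No system matrices appear.
    Returns (L, K) as lists so that u = sum(L[i] * x[i]) and
    w = sum(K[i] * x[i]).
    """
    huu, huw = H[n][n], H[n][n + 1]
    hwu, hww = H[n + 1][n], H[n + 1][n + 1]
    det = huu * hww - huw * hwu
    if det == 0:
        raise ValueError("action block of H is singular")
    L, K = [], []
    for i in range(n):
        hux, hwx = H[n][i], H[n + 1][i]
        L.append(-(hww * hux - huw * hwx) / det)
        K.append(-(huu * hwx - hwu * hux) / det)
    return L, K


# Hypothetical kernel for a one-dimensional state (values chosen for illustration).
H_example = [
    [1.0, 0.5, 0.2],
    [0.5, 2.0, 0.0],
    [0.2, 0.0, -4.0],
]
L_gain, K_gain = saddle_point_gains(H_example, n=1)
```

The stationary point is a saddle (minimizing in u, maximizing in w) when Huu is positive and Hww is negative, as in the example kernel. In a learning setting, H would be estimated from measured data by the critic rather than written down by hand.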
Keywords
H∞ control; Riccati equations; approximation theory; discrete time systems; dynamic programming; game theory; learning (artificial intelligence); matrix algebra; quadratic programming; F-16 aircraft; GARE; H-infinity control application; H-infinity control autopilot design; H-infinity optimal control problem; Nash equilibrium; Q-learning approximate dynamic programming model free approach; action dependent value function; adaptive critic methods; discrete time linear system quadratic zero sum games; dynamical matrices system; game algebraic Riccati equation; linear quadratic discrete time zero sum game; model free Q-learning designs; model-free iterative algorithm; optimal strategies; state dependent value function; Dynamic programming; Equations; Game theory; Games; Heuristic algorithms; Mathematical model; Optimal control; Adaptive control; Adaptive critics; Approximate dynamic programming; H∞ optimal control; Policy iterations; Q-function; Q-learning; Zero-sum games;
fLanguage
English
Publisher
IEEE
Conference_Titel
Control Conference (ECC), 2007 European
Conference_Location
Kos
Print_ISBN
978-3-9524173-8-6
Type
conf
Filename
7068263