Title :
Adaptive Dynamic Programming algorithm for finding online the equilibrium solution of the two-player zero-sum differential game
Author :
Vrabie, Draguna ; Lewis, Frank
Author_Institution :
Autom. & Robot. Res. Inst., Univ. of Texas at Arlington, Fort Worth, TX, USA
Abstract :
This paper presents an Approximate/Adaptive Dynamic Programming (ADP) algorithm that determines online the Nash equilibrium solution of the two-player zero-sum differential game with linear dynamics and infinite-horizon quadratic cost. The algorithm is built around an iterative method developed in the control engineering community for solving the continuous-time game algebraic Riccati equation (CT-GARE) that underlies the game problem. We show how ADP techniques extend this offline method to an online solution that does not require complete knowledge of the system dynamics. Working in the framework of control applications, we refer to the two players as the controller and the disturbance. Both players compete in real time, and the equilibrium policies are determined from data measured online from the system. The two players do not learn concurrently. The algorithm alternates between a learning phase, in which the controller learns in order to optimize its behavior, and a policy update step, in which the disturbance increases its detrimental effect. Each update of the disturbance policy leaves the controller policy no longer optimal and so makes room for its further improvement. The control policy is learned online using a continuous-time heuristic dynamic programming procedure. The feasibility of the ADP scheme is demonstrated in simulation on a power system, where the goal is to determine the control policy that counters the highest load disturbance in an optimal manner.
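The alternation described in the abstract (the controller optimizes against a frozen disturbance policy, then the disturbance updates its policy) can be illustrated with a minimal offline sketch. This is not the paper's online ADP procedure. It assumes a scalar system dx/dt = a*x + b*u + d*w with cost integrand q*x^2 + r*u^2 - gamma^2*w^2, uses full knowledge of the model, and replaces the online learning phase with a closed-form solution of the controller's scalar Riccati equation. All function names and parameter values are hypothetical.

```python
import math


def controller_are(a, b, d, q, r, gamma, L):
    """Controller step: with the disturbance frozen at w = L*x, return the
    positive root p of the scalar Riccati equation
        2*(a + d*L)*p + (q - gamma^2 * L^2) - (b^2 / r) * p^2 = 0.
    Assumes the expression under the square root is non-negative."""
    a_cl = a + d * L                  # drift with the disturbance loop closed
    q_eff = q - gamma**2 * L**2       # state weight net of the disturbance term
    s = b**2 / r
    return (a_cl + math.sqrt(a_cl**2 + s * q_eff)) / s


def gare_residual(p, a, b, d, q, r, gamma):
    """Residual of the scalar game Riccati equation
        2*a*p + q - (b^2/r - d^2/gamma^2) * p^2 = 0."""
    return 2 * a * p + q - (b**2 / r - d**2 / gamma**2) * p**2


def iterate_game(a, b, d, q, r, gamma, tol=1e-12, max_iter=100):
    """Alternate controller optimization and disturbance policy update,
    starting from a zero disturbance policy."""
    L = 0.0
    history = []
    for _ in range(max_iter):
        p = controller_are(a, b, d, q, r, gamma, L)   # controller phase
        history.append(p)
        L_new = d * p / gamma**2                      # disturbance update
        if abs(L_new - L) < tol:
            break
        L = L_new
    return p, history


# Illustrative (assumed) parameters, not taken from the paper.
p_star, p_history = iterate_game(a=-1.0, b=1.0, d=1.0, q=1.0, r=1.0, gamma=2.0)
print(p_star, gare_residual(p_star, -1.0, 1.0, 1.0, 1.0, 1.0, 2.0))
```

In this sketch the fixed point of the alternation satisfies the scalar game Riccati equation, since substituting L = d*p/gamma^2 into the controller's equation recovers it. An online variant would have to estimate p from measured trajectories rather than from a, which is the part the ADP scheme in the paper addresses.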
Keywords :
Riccati equations; control engineering computing; dynamic programming; game theory; heuristic programming; iterative methods; learning (artificial intelligence); power engineering computing; power system control; ADP techniques; Nash equilibrium solution; adaptive dynamic programming algorithm; approximate dynamic programming algorithm; continuous-time game algebraic Riccati equation; continuous-time heuristic dynamic programming procedure; control engineering community; detrimental effect; disturbance policy; infinite horizon quadratic cost; iterative method; linear dynamics; load disturbance; optimal controller policy; power system; two-player zero-sum differential game; Games; H infinity control; Heuristic algorithms; Iterative methods; Mathematical model; Optimal control; Riccati equations
Conference_Title :
Neural Networks (IJCNN), The 2010 International Joint Conference on
Conference_Location :
Barcelona
Print_ISBN :
978-1-4244-6916-1
DOI :
10.1109/IJCNN.2010.5596754