DocumentCode
1743876
Title
Adaptive zero-sum stochastic game for two finite Markov chains
Author
Poznyak, A.S. ; Najim, K.
Author_Institution
Control Autom., CINVESTAV-IPN, Mexico City, Mexico
Volume
1
fYear
2000
fDate
2000
Firstpage
717
Abstract
A repeated zero-sum stochastic game between two finite Markov chains with unknown transition matrices and payoffs is considered. The control objective is to obtain the equilibrium point based only on current measurements. The behavior of each player is modelled by a finite controlled Markov chain. A novel adaptive policy is developed based on Lagrange multipliers involved in a “learning through reinforcement” procedure. A regularized Lagrange function and a new normalization procedure are introduced. The saddle-point of this function is shown to be unique. The convergence properties are proved and the order of almost sure convergence is estimated as $n^{-1/3}$.
Keywords
Lyapunov methods; Markov processes; convergence; matrix algebra; probability; stochastic games; adaptive policy; adaptive zero-sum stochastic game; control objective; convergence properties; equilibrium point; finite controlled Markov chain; normalization procedure; regularized Lagrange function; reinforcement learning; repeated game; saddle-point; Adaptive control; Automatic control; Convergence; Current measurement; Laboratories; Lagrangian functions; Process control; Programmable control; Recursive estimation; Stochastic processes;
fLanguage
English
Publisher
IEEE
Conference_Titel
Decision and Control, 2000. Proceedings of the 39th IEEE Conference on
Conference_Location
Sydney, NSW
ISSN
0191-2216
Print_ISBN
0-7803-6638-7
Type
conf
DOI
10.1109/CDC.2000.912852
Filename
912852