• DocumentCode
    1743876
  • Title
    Adaptive zero-sum stochastic game for two finite Markov chains
  • Author
    Poznyak, A.S. ; Najim, K.
  • Author_Institution
    Control Autom., CINVESTAV-IPN, Mexico City, Mexico
  • Volume
    1
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    717
  • Abstract
    A repeated zero-sum stochastic game for two finite Markov chains with unknown transition matrices and payoffs is considered. The control objective is to obtain the equilibrium point based only on current measurements. The behavior of each player is modelled by a finite controlled Markov chain. A novel adaptive policy is developed based on Lagrange multipliers involved in a “learning through reinforcement” procedure. A regularized Lagrange function and a new normalization procedure are introduced. The saddle-point of this function is shown to be unique. The convergence properties are proved and the order of almost sure convergence is estimated as n^(-1/3).
  • Keywords
    Lyapunov methods; Markov processes; convergence; matrix algebra; probability; stochastic games; adaptive policy; adaptive zero-sum stochastic game; control objective; convergence properties; equilibrium point; finite controlled Markov chain; normalization procedure; regularized Lagrange function; reinforcement learning; repeated game; saddle-point; Adaptive control; Automatic control; Convergence; Current measurement; Laboratories; Lagrangian functions; Process control; Programmable control; Recursive estimation; Stochastic processes;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Decision and Control, 2000. Proceedings of the 39th IEEE Conference on
  • Conference_Location
    Sydney, NSW
  • ISSN
    0191-2216
  • Print_ISBN
    0-7803-6638-7
  • Type
    conf
  • DOI
    10.1109/CDC.2000.912852
  • Filename
    912852