Regional Information Center for Science and Technology - A near-optimal polynomial time algorithm for learning in certain classes of stochastic games Original Research Article

Title of article :

A near-optimal polynomial time algorithm for learning in certain classes of stochastic games

Author/Authors :

Craig Boutilier, Ronen I. Brafman, Carmel Domshlak, Holger H. Hoos (author); Moshe Tennenholtz (author)

Issue Information :

Periodical with serial number, year 2000

Pages :

From page :

To page :

Abstract :

We present a new algorithm for polynomial time learning of optimal behavior in single-controller stochastic games. This algorithm incorporates and integrates important recent results of Kearns and Singh (Proc. ICML-98, 1998) in reinforcement learning and of Monderer and Tennenholtz (J. Artif. Intell. Res. 7, 1997, p. 231) in repeated games. In stochastic games, the agent must cope with the existence of an adversary whose actions can be arbitrary. In particular, this adversary can withhold information about the game matrix by never, or only rarely, performing certain actions. This forces upon us an exploration versus exploitation dilemma more complex than the one in Markov decision processes: given information about particular parts of a game matrix, the agent must decide how much effort to invest in learning the unknown parts of the matrix. We present a polynomial time algorithm that addresses these issues in the context of the class of single-controller stochastic games, providing the agent with near-optimal return.

Keywords :

Exploration versus exploitation in multi-agent systems, Stochastic games, Polynomial time learning in hostile environments

Journal title :

Artificial Intelligence

Serial Year :

2000

Record number :

1206878

Link To Document :

https://search.isc.ac/dl/search/defaultta.aspx?DTC=10&DC=1206878