DocumentCode
423673
Title
Incremental policy learning: an equilibrium selection algorithm for reinforcement learning agents with common interests
Author
Fulda, Nancy ; Ventura, Dan
Author_Institution
Dept. of Comput. Sci., Brigham Young Univ., Provo, UT, USA
Volume
2
fYear
2004
fDate
25-29 July 2004
Firstpage
1121
Abstract
We present an equilibrium selection algorithm for reinforcement learning agents that incrementally adjusts the probability of executing each action based on the desirability of the outcome obtained in the last time step. The algorithm assumes that at least one coordination equilibrium exists and requires that the agents have a heuristic for determining whether or not the equilibrium was obtained. In deterministic environments with one or more strict coordination equilibria, the algorithm learns to play an optimal equilibrium as long as the heuristic is accurate. Empirical data demonstrate that the algorithm is also effective in stochastic environments and is able to learn good joint policies when the heuristic's parameters are estimated during learning, rather than known in advance.
Keywords
learning (artificial intelligence); multi-agent systems; optimisation; probability; stochastic processes; equilibrium selection algorithm; incremental policy learning; optimal equilibrium; probability; reinforcement learning agents; stochastic environments; Computer science; Learning; Minimax techniques; Parameter estimation; Stochastic processes; Taxonomy;
fLanguage
English
Publisher
IEEE
Conference_Titel
Neural Networks, 2004. Proceedings. 2004 IEEE International Joint Conference on
ISSN
1098-7576
Print_ISBN
0-7803-8359-1
Type
conf
DOI
10.1109/IJCNN.2004.1380091
Filename
1380091