• DocumentCode
    423673
  • Title
    Incremental policy learning: an equilibrium selection algorithm for reinforcement learning agents with common interests
  • Author
    Fulda, Nancy; Ventura, Dan

  • Author_Institution
    Dept. of Comput. Sci., Brigham Young Univ., Provo, UT, USA
  • Volume
    2
  • fYear
    2004
  • fDate
    25-29 July 2004
  • Firstpage
    1121
  • Abstract
    We present an equilibrium selection algorithm for reinforcement learning agents that incrementally adjusts the probability of executing each action based on the desirability of the outcome obtained in the last time step. The algorithm assumes that at least one coordination equilibrium exists and requires that the agents have a heuristic for determining whether or not the equilibrium was obtained. In deterministic environments with one or more strict coordination equilibria, the algorithm learns to play an optimal equilibrium as long as the heuristic is accurate. Empirical data demonstrate that the algorithm is also effective in stochastic environments and is able to learn good joint policies when the heuristic's parameters are estimated during learning, rather than known in advance.
  • Keywords
    learning (artificial intelligence); multi-agent systems; optimisation; probability; stochastic processes; equilibrium selection algorithm; incremental policy learning; optimal equilibrium; probability; reinforcement learning agents; stochastic environments; Computer science; Learning; Minimax techniques; Parameter estimation; Stochastic processes; Taxonomy;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2004. Proceedings. 2004 IEEE International Joint Conference on
  • ISSN
    1098-7576
  • Print_ISBN
    0-7803-8359-1
  • Type
    conf
  • DOI
    10.1109/IJCNN.2004.1380091
  • Filename
    1380091