• DocumentCode
    3399768
  • Title

    A strategy for converging dynamic action policies

  • Author

    Ribeiro, Richardson ; Borges, André P. ; Koerich, Alessandro L. ; Scalabrin, Edson E. ; Enembreck, Fabricio

  • Author_Institution
    Univ. of Contestado-UnC, Mafra
  • fYear
    2009
  • fDate
    March 30 2009-April 2 2009
  • Firstpage
    136
  • Lastpage
    143
  • Abstract
    In this paper we propose a novel strategy for converging dynamic policies generated by adaptive agents, which receive and accumulate rewards for their actions. The goal of the proposed strategy is to speed up the convergence of such agents to a good policy in dynamic environments. Since it is difficult to obtain a good value for a state when the environment changes continuously, previous policies are kept in memory for reuse in future policies, avoiding delays or unexpected speedups in the agent's learning. Experimental results on dynamic environments with different policies show that the proposed strategy speeds up the agent's convergence while achieving good action policies.
  • Keywords
    Markov processes; learning (artificial intelligence); multi-agent systems; Markov decision process; adaptive agents; dynamic action policies; dynamic environments; Computer science; Convergence; Decision making; Delay; Learning; State estimation; Stochastic processes; Adaptive Agents; Dynamic Environments; Reinforcement Learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Agents, 2009. IA '09. IEEE Symposium on
  • Conference_Location
    Nashville, TN
  • Print_ISBN
    978-1-4244-2767-3
  • Type

    conf

  • DOI
    10.1109/IA.2009.4927511
  • Filename
    4927511