• DocumentCode
    320659
  • Title

    Speed up reinforcement learning between two agents with adaptive mimetism

  • Author

    Yamaguchi, Tomohiro ; Tanaka, Yasuhiro ; Yachida, Masahiko

  • Author_Institution
    Dept. of Syst. & Human Sci., Osaka Univ., Japan
  • Volume
    2
  • fYear
    1997
  • fDate
    7-11 Sep 1997
  • Firstpage
    594
  • Abstract
    To realize a speed-up in learning without homogenizing the agents' behaviors in a multi-agent system, it is important to selectively share learning results. This paper describes a method designed to permit multiple agents to learn cooperatively. The advantage of our method is that it dynamically switches the learning mode between mimetism and reinforcement learning according to the situation. Mimetism seeks stability in its behavior, while individual reinforcement learning seeks a better solution. Accordingly, selective mimetism, which allows the agents to partially share learning results, works to prevent homogenization among the agents. Experimental results for a ball-pushing task between two virtual agents are given to evaluate the effectiveness of our method. This method will be useful for cooperative reinforcement learning with adaptive mimetism based on propagating the learned behaviors of a virtual agent to a physical robot in order to accelerate learning in a physical environment.
  • Keywords
    cooperative systems; decision theory; intelligent control; learning (artificial intelligence); adaptive mimetism; ball-pushing task; cooperative learning; learning mode; multi-agent system; reinforcement learning; virtual agents; Convergence; Costs; Learning systems; Performance analysis; Robots; State-space methods; Stochastic processes; Switches; Testing; Virtual environment;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Robots and Systems, 1997. IROS '97., Proceedings of the 1997 IEEE/RSJ International Conference on
  • Conference_Location
    Grenoble
  • Print_ISBN
    0-7803-4119-8
  • Type

    conf

  • DOI
    10.1109/IROS.1997.655072
  • Filename
    655072