• DocumentCode
    561207
  • Title
    Adaptive Profit Sharing Reinforcement Learning Method for Dynamic Environment
  • Author
    Koujaku, Sadamori ; Watanabe, Kota ; Igarashi, Hajime
  • Author_Institution
    Grad. Sch. of Inf. Sci. Technol., Hokkaido Univ., Sapporo, Japan
  • Volume
    1
  • fYear
    2011
  • fDate
    18-21 Dec. 2011
  • Firstpage
    462
  • Lastpage
    465
  • Abstract
    In this paper, an Adaptive Forgettable Profit Sharing reinforcement learning method is introduced. This method enables agents to adapt to environmental changes very quickly. It can be used to learn robust and effective actions in uncertain environments that have the non-Markov property, especially partially observable Markov decision processes (POMDPs). Profit Sharing learns a rational policy that is easy to learn and results in good behavior in POMDPs. However, the policy degrades in large, dynamic environments that change frequently and require many actions to reach the goal. To handle such environments, forgetting, which gives Profit Sharing adaptability and rationality, is implemented. This method allows the agent to forget past experiences that reduce the rationality of its policy. The usefulness of the proposed algorithm is demonstrated through numerical examples.
  • Keywords
    Markov processes; incentive schemes; learning (artificial intelligence); adaptive forgettable profit sharing reinforcement learning method; dynamic environment; nonMarkov property; partial observable Markov process; Educational institutions; Information science; Learning; Learning systems; Markov processes; Robustness; Reinforcement Learning; forgetting; rational theorem;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications and Workshops (ICMLA), 2011 10th International Conference on
  • Conference_Location
    Honolulu, HI
  • Print_ISBN
    978-1-4577-2134-2
  • Type
    conf
  • DOI
    10.1109/ICMLA.2011.25
  • Filename
    6147020