• DocumentCode
    3232811
  • Title

    A dynamical policy search model for matching law

  • Author

    Zhenbo, Cheng; Zhidong, Deng

  • Author_Institution
    Dept. of Comput. Sci., Tsinghua Univ., Beijing, China
  • fYear
    2010
  • fDate
    23-26 Sept. 2010
  • Firstpage
    127
  • Lastpage
    131
  • Abstract
    The matching law states that the fraction of choices allocated to any option matches the fraction of total reward earned from that option. However, matching behavior does not necessarily yield the optimal reward, and it is unclear why subjects frequently exhibit matching behavior rather than optimal behavior. In this study, we propose an optimal algorithm based on the policy search model in reinforcement learning, and derive from it a policy algorithm that leads to the matching law. Theoretical analysis and simulation results show that the decision behavior produced by our algorithm reaches the matching law under many kinds of reward schedules. Our results indicate that the matching law can be exhibited whenever the subject tries to maximize a value function under the simple assumption that past choice behavior takes no account of the values of future long-run reward. These results reveal the relationship between matching behavior and the optimal policy search algorithm.
  • Keywords
    behavioural sciences; learning (artificial intelligence); search problems; decision behavior; dynamical policy search model; income; matching law; reinforcement learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bio-Inspired Computing: Theories and Applications (BIC-TA), 2010 IEEE Fifth International Conference on
  • Conference_Location
    Changsha
  • Print_ISBN
    978-1-4244-6437-1
  • Type

    conf

  • DOI
    10.1109/BICTA.2010.5645345
  • Filename
    5645345