Title :
A dynamical policy search model for matching law
Author :
Zhenbo, Cheng ; Zhidong, Deng
Author_Institution :
Dept. of Comput. Sci., Tsinghua Univ., Beijing, China
Abstract :
The matching law states that the fraction of choices made to any option will match the fraction of total rewards earned from that option. However, the income earned from conducting the matching behavior does not imply that it will get the optimal reward. It is unclear why subjects frequently exhibit the matching behavior rather than the optimal behavior. In this study, on the basis of the policy search model in reinforcement learning, an optimal algorithm is proposed, and the policy algorithm leading to matching law is derived from the optimal algorithm. Theoretical analysis and simulation results show that the decision behavior achieved by our algorithm is able to reach matching law in many kinds of reward schedules. Our results indicate that matching law can be exhibited whenever the subject tries to maximize a value function under a simple assumption that past choice behavior does not care about the values of future long-run reward. This results unveil the relationships between the matching behavior and the algorithm of optimal policy search.
Keywords :
behavioural sciences; learning (artificial intelligence); search problems; decision behavior; dynamical policy search model; income; matching law; reinforcement learning;
Conference_Titel :
Bio-Inspired Computing: Theories and Applications (BIC-TA), 2010 IEEE Fifth International Conference on
Conference_Location :
Changsha
Print_ISBN :
978-1-4244-6437-1
DOI :
10.1109/BICTA.2010.5645345