Abstract:
This work deals with Q-learning in a multiagent environment. Many multiagent Q-learning methods exist, and most of them aim to converge to a Nash equilibrium, which is not desirable in games such as the prisoner's dilemma (PD). However, ordinary Q-learning agents that choose actions stochastically to avoid local optima may produce mutual cooperation in the PD. Although such mutual cooperation usually occurs only as an isolated event, it can be maintained if the Q-function of cooperation becomes larger than that of defection after the cooperation. This work derives a theorem on how many mutual cooperations are needed to make the Q-function of cooperation larger than that of defection. In addition, from the perspective of the author's previous works, which distinguish utilities from rewards and use utilities for learning in the PD, this work also derives a corollary on how much utility is necessary for a single (one-shot) mutual cooperation to make the Q-function of cooperation larger than that of defection.
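The mechanism discussed above can be illustrated with a small, self-contained sketch. The following Python code is not the paper's algorithm; it is an assumed setup with two stateless tabular Q-learners playing the iterated PD under epsilon-greedy (stochastic) action selection, with illustrative payoffs T, R, P, S and learning parameters. With plain rewards, defection typically dominates, which is exactly why the paper asks how many mutual cooperations (or how much utility, in the utility-based variant) are needed before the Q-value of cooperation overtakes that of defection.

```python
# Minimal sketch (assumed parameters, not the paper's exact method):
# two stateless tabular Q-learners play the iterated prisoner's dilemma
# with epsilon-greedy (stochastic) action selection.
import random

T, R, P, S = 5, 3, 1, 0          # temptation, reward, punishment, sucker
PAYOFF = {('C', 'C'): (R, R), ('C', 'D'): (S, T),
          ('D', 'C'): (T, S), ('D', 'D'): (P, P)}
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # illustrative learning parameters

def choose(q):
    """Epsilon-greedy choice: mostly exploit, occasionally explore."""
    if random.random() < EPSILON:
        return random.choice(['C', 'D'])
    return max(q, key=q.get)

def play(episodes=10000, seed=0):
    random.seed(seed)
    q1 = {'C': 0.0, 'D': 0.0}
    q2 = {'C': 0.0, 'D': 0.0}
    for _ in range(episodes):
        a1, a2 = choose(q1), choose(q2)
        r1, r2 = PAYOFF[(a1, a2)]
        # Stateless Q-update: Q(a) <- Q(a) + alpha * (r + gamma * max Q - Q(a))
        q1[a1] += ALPHA * (r1 + GAMMA * max(q1.values()) - q1[a1])
        q2[a2] += ALPHA * (r2 + GAMMA * max(q2.values()) - q2[a2])
    return q1, q2

if __name__ == '__main__':
    q1, q2 = play()
    print('agent 1:', q1)   # compare Q('C') vs. Q('D') after learning
    print('agent 2:', q2)
```

In this sketch, exploration occasionally produces a mutual cooperation that raises Q('C'); whether Q('C') ends up above Q('D') depends on how many such cooperations occur and on the payoffs, which is the quantity the theorem and corollary characterize.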
Keywords:
game theory; learning (artificial intelligence); multi-agent systems; Nash equilibrium; multiagent Q-learning; multiagent environment; prisoner's dilemma; stochastic methods; utility-based Q-learning; intelligent agents; machine learning; stochastic processes