Regional Information Center for Science and Technology - Multiple timescales PIA for cooperative reinforcement learning based on MDP model

DocumentCode :

2644535

Title :

Multiple timescales PIA for cooperative reinforcement learning based on MDP model

Author :

Yamaguchi, Tomohiro ; Imatani, Eri

Author_Institution :

Nara Nat. Coll. of Technol., Nara

fYear :

2007

fDate :

17-20 Sept. 2007

Firstpage :

2785

Lastpage :

2791

Abstract :

This paper describes a new dynamic programming (DP) based method for multiagent reinforcement learning in the Markov decision process (MDP) model. In a multiagent setting it is difficult for agents to learn cooperative actions properly, because they may all change their policies at the same time. To solve this problem, each agent should perform its policy improvement at a different time. We therefore propose a multiple timescales policy improvement method. We present comparative experiments between multiple timescales policy improvement and exclusive policy improvement. The results show that our method reduced the search cost of finding the optimal common-payoff Nash solution.
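
The problem the abstract names, agents changing their policies at the same time, can be illustrated with a toy example. The sketch below is not the paper's algorithm: it assumes a single-state, two-agent, common-payoff game with a hypothetical payoff matrix `PAYOFF`, and a hypothetical schedule in which agent i improves its action only at steps divisible by `periods[i]`. It only shows why giving the agents different improvement timescales can break an oscillation that simultaneous improvement produces.

```python
from typing import List, Tuple

# Hypothetical common-payoff matrix (an assumption, not from the paper):
# PAYOFF[a0][a1] is the reward that both agents receive.
PAYOFF: List[List[float]] = [[2.0, 0.0],
                             [0.0, 1.0]]


def best_response(agent: int, other_action: int) -> int:
    """Greedy improvement for one agent, holding the other agent's action fixed."""
    if agent == 0:
        values = [PAYOFF[a][other_action] for a in range(2)]
    else:
        values = [PAYOFF[other_action][a] for a in range(2)]
    return values.index(max(values))


def run(periods: Tuple[int, int], start: Tuple[int, int],
        steps: int) -> List[Tuple[int, int]]:
    """Agent i improves its action only at steps t with t % periods[i] == 0."""
    joint = start
    history = [joint]
    for t in range(steps):
        a0, a1 = joint
        # Both updates are computed from the same old joint action,
        # so agents that update in the same step do not see each other's change.
        new0 = best_response(0, a1) if t % periods[0] == 0 else a0
        new1 = best_response(1, a0) if t % periods[1] == 0 else a1
        joint = (new0, new1)
        history.append(joint)
    return history


# Same timescale for both agents: they swap actions forever.
simultaneous = run(periods=(1, 1), start=(0, 1), steps=6)
# Different timescales: agent 1 improves half as often as agent 0.
staggered = run(periods=(1, 2), start=(0, 1), steps=6)

print(simultaneous)
print(staggered)
```

With these assumed values, the simultaneous run alternates between the two zero-payoff joint actions, while the staggered run settles on the joint action (0, 0). In general, staggered greedy improvement reaches some Nash solution of a common-payoff game but not necessarily the optimal one; how the paper's method searches for the optimal solution is not described in this record.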

Keywords :

Markov processes; decision theory; iterative methods; learning (artificial intelligence); multi-agent systems; Markov decision process model; cooperative reinforcement learning; dynamic programming; multiagent reinforcement learning; multiple timescales policy iteration algorithm; optimal common-payoff Nash solution; Artificial intelligence; Cost function; Dynamic programming; Educational institutions; Electronic mail; Game theory; Learning systems; Multiagent systems; Nash equilibrium; Stochastic processes; PIA; cooperative; multiagent reinforcement learning; multiple timescales;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

SICE, 2007 Annual Conference

Conference_Location :

Takamatsu

Print_ISBN :

978-4-907764-27-2

Electronic_ISBN :

978-4-907764-27-2

Type :

conf

DOI :

10.1109/SICE.2007.4421462

Filename :

4421462

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2644535