Regional Information Center for Science and Technology - Proposal of a propagation algorithm of the Expected Failure Probability and the effectiveness on multi-agent environments

DocumentCode :

681073

Title :

Proposal of a propagation algorithm of the Expected Failure Probability and the effectiveness on multi-agent environments

Author :

Miyazaki, Kazuteru ; Muraoka, Hiroki ; Kobayashi, Hiroaki

Author_Institution :

Research Department, National Institution for Academic Degrees and University Evaluation, Tokyo, Japan

Year :

2013

Date :

14-17 Sept. 2013

Firstpage :

1067

Lastpage :

1072

Abstract :

The Improved Penalty Avoiding Rational Policy Making algorithm (IPARP) can learn from both a reward and a penalty. IPARP aims to find penalty rules, that is, rules that are highly likely to receive a penalty. Although IPARP is effective in many cases, it requires many trial-and-error searches because of memory constraints. In this paper, a propagation algorithm for the Expected Failure Probability (EFP) is proposed to speed it up. Furthermore, the algorithm is extended to multi-agent environments. In multi-agent learning, it is important to avoid the concurrent learning problem [1], which occurs when multiple agents learn concurrently. Hence, two methods are proposed to avoid the problem, and their effectiveness is confirmed by numerical experiments.

Keywords :

Boltzmann distribution; Educational institutions; Electronic mail; Learning (artificial intelligence); Least squares methods; Memory management; Proposals; Exploitation-oriented Learning; Multi-agent learning; Reinforcement Learning; concurrent learning problem;

Language :

English

Publisher :

IEEE

Conference_Title :

2013 Proceedings of SICE Annual Conference (SICE)

Conference_Location :

Nagoya, Japan

Type :

conf

Filename :

6736240

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=681073