Threshold learning in the improved penalty avoiding rational policy making algorithm

Author

Miyazaki, Kazuteru ; Kobayashi, Ryouhei ; Kobayashi, Hiroaki

Author_Institution

Dept. of Assessment & Res. for Degree Awarding, Univ. Evaluation, Tokyo, Japan

fYear

2010

fDate

18-21 Aug. 2010

Firstpage

3240

Lastpage

3245

Abstract

The penalty avoiding rational policy making algorithm (PARP) was previously improved to save memory and cope with uncertainty, yielding Improved PARP (IPARP). The efficiency of IPARP is significantly influenced by γ, the threshold of a penalty rule or a penalty basis function. In this paper, we propose a technique for learning γ. We show the effectiveness of our proposal using a soccer game task called “Keepaway”.

Keywords

game theory; learning (artificial intelligence); PARP; keepaway; penalty avoiding rational policy making algorithm; soccer game; threshold learning; Function approximation; Games; Machine learning; Memory management; Proposals; Tiles; Uncertainty; Exploitation-oriented Learning (XoL); Improved PARP; Keepaway Task; Reinforcement Learning; Threshold Learning

fLanguage

English

Publisher

IEEE

Conference_Titel

SICE Annual Conference 2010, Proceedings of

Conference_Location

Taipei

Print_ISBN

978-1-4244-7642-8

Type

conf

Filename

5602796

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=529477