Regional Information Center for Science and Technology - The Two Facets of the Exploration-Exploitation Dilemma

DocumentCode :

3103085

Title :

The Two Facets of the Exploration-Exploitation Dilemma

Author :

Zhang, Kaifu ; Pan, Wei

Author_Institution :

Tsinghua Univ., Beijing

fYear :

2006

fDate :

18-22 Dec. 2006

Firstpage :

371

Lastpage :

380

Abstract :

This paper proposes an algorithm to better solve the exploration-exploitation dilemma faced by model-less reinforcement learning agents. The main contribution is twofold: (1) The two facets of the exploration-exploitation dilemma are distinguished: in some cases, the agent faces a non-stationary environment and therefore needs to choose the best moment to explore in order to adapt to the changes; in other cases, the agent faces a relatively large state-action space and therefore needs to choose the most promising subset of states/actions to explore. In this two-facet framework, we compared the relative advantages and limitations of two previously proposed algorithms in different situations. (2) We unified these two algorithms to produce a new algorithm that works fairly well in all testing situations.

Keywords :

learning (artificial intelligence); multi-agent systems; exploration-exploitation dilemma; large state-action space; model-less reinforcement learning agent; nonstationary environment; Benchmark testing; Large-scale systems; Learning; Navigation; Orbital robotics; Robot kinematics; Space exploration; State estimation

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Intelligent Agent Technology, 2006. IAT '06. IEEE/WIC/ACM International Conference on

Conference_Location :

Hong Kong

Print_ISBN :

0-7695-2748-5

Type :

conf

DOI :

10.1109/IAT.2006.120

Filename :

4052945

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3103085