DocumentCode :
2766215
Title :
Opposition-Based Q(λ) Algorithm
Author :
Shokri, M.; Tizhoosh, H.R.; Kamel, M.
Author_Institution :
University of Waterloo, Waterloo, ON, Canada
fYear :
2006
fDate :
16-21 July 2006
Firstpage :
254
Lastpage :
261
Abstract :
The problem of delayed reward in reinforcement learning is usually tackled by implementing the mechanism of eligibility traces. In this paper we introduce an extension of eligibility traces to address one of the challenging problems in reinforcement learning: large state spaces. The concept of opposition traces is proposed to deal with large-state-space problems in reinforcement learning applications. We combine the ideas of opposition and eligibility traces to construct the opposition-based Q(λ) algorithm. The results are compared with those of the conventional Watkins' Q(λ) and show a remarkable performance increase.
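Note: the following is a minimal tabular Python sketch of one plausible reading of the abstract, namely Watkins' Q(λ) extended with an opposition trace that pushes the value of an assumed "opposite" action in the reverse direction. The opposite_action map, the reversed-sign update, and all parameter values are illustrative assumptions for a small gridworld, not the authors' exact rules.

import numpy as np

n_states, n_actions = 16, 4
alpha, gamma, lam = 0.1, 0.95, 0.9   # assumed learning rate, discount, trace decay

Q = np.zeros((n_states, n_actions))
e = np.zeros_like(Q)       # eligibility traces
e_opp = np.zeros_like(Q)   # opposition traces (assumed structure)

def opposite_action(a):
    # Hypothetical opposite-action map for a 4-action gridworld:
    # up <-> down (0 <-> 1), left <-> right (2 <-> 3).
    return {0: 1, 1: 0, 2: 3, 3: 2}[a]

def update(s, a, r, s_next, a_next):
    """One Watkins' Q(lambda) step plus an assumed opposition-trace update."""
    a_star = int(np.argmax(Q[s_next]))           # greedy next action (Watkins' variant)
    delta = r + gamma * Q[s_next, a_star] - Q[s, a]

    e[s, a] += 1.0                               # accumulate eligibility
    e_opp[s, opposite_action(a)] += 1.0          # accumulate opposition trace (assumption)

    # Update visited pairs via eligibility traces; push the opposite
    # action's value in the reverse direction via opposition traces (assumption).
    Q[:] += alpha * delta * e
    Q[:] -= alpha * delta * e_opp

    if a_next == a_star:                         # Watkins: decay traces after greedy
        e[:] *= gamma * lam                      # actions, cut them after exploration
        e_opp[:] *= gamma * lam
    else:
        e[:] = 0.0
        e_opp[:] = 0.0

# Example call on a synthetic transition (illustrative values only):
update(s=0, a=2, r=1.0, s_next=1, a_next=0)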
Keywords :
learning (artificial intelligence); delayed reward; eligibility traces; large state space; opposition traces; opposition-based algorithm; reinforcement learning; Acceleration; Delay; Design engineering; Laboratories; Learning; Machine intelligence; Neural networks; Pattern analysis; System analysis and design; Systems engineering and theory;
fLanguage :
English
Publisher :
ieee
Conference_Title :
2006 International Joint Conference on Neural Networks (IJCNN '06)
Conference_Location :
Vancouver, BC
Print_ISBN :
0-7803-9490-9
Type :
conf
DOI :
10.1109/IJCNN.2006.246689
Filename :
1716100