DocumentCode :
2766215
Title :
Opposition-Based Q(λ) Algorithm
Author :
Shokri, M.; Tizhoosh, H.R.; Kamel, M.
Author_Institution :
University of Waterloo, Waterloo, ON, Canada
fYear :
2006
fDate :
16-21 July 2006
Firstpage :
254
Lastpage :
261
Abstract :
The problem of delayed reward in reinforcement learning is usually tackled by implementing the mechanism of eligibility traces. In this paper we introduce an extension of eligibility traces to address one of the challenging problems in reinforcement learning: large state spaces. The concept of opposition traces is proposed to deal with large-state-space problems in reinforcement learning applications. We combine the ideas of opposition and eligibility traces to construct the opposition-based Q(λ) algorithm. The results are compared with those of the conventional Watkins' Q(λ) and show a remarkable performance increase.
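Note: the following is a minimal tabular Python sketch of one plausible reading of the abstract, namely Watkins' Q(λ) extended with an opposition trace that pushes the value of an assumed "opposite" action in the reverse direction. The opposite_action map, the reversed-sign update, and all parameter values are illustrative assumptions for a small gridworld, not the authors' exact rules.

import numpy as np

n_states, n_actions = 16, 4
alpha, gamma, lam = 0.1, 0.95, 0.9   # assumed learning rate, discount, trace decay

Q = np.zeros((n_states, n_actions))
e = np.zeros_like(Q)       # eligibility traces
e_opp = np.zeros_like(Q)   # opposition traces (assumed structure)

def opposite_action(a):
    # Hypothetical opposite-action map for a 4-action gridworld:
    # up <-> down (0 <-> 1), left <-> right (2 <-> 3).
    return {0: 1, 1: 0, 2: 3, 3: 2}[a]

def update(s, a, r, s_next, a_next):
    """One Watkins' Q(lambda) step plus an assumed opposition-trace update."""
    a_star = int(np.argmax(Q[s_next]))           # greedy next action (Watkins' variant)
    delta = r + gamma * Q[s_next, a_star] - Q[s, a]

    e[s, a] += 1.0                               # accumulate eligibility
    e_opp[s, opposite_action(a)] += 1.0          # accumulate opposition trace (assumption)

    # Update visited pairs via eligibility traces; push the opposite
    # action's value in the reverse direction via opposition traces (assumption).
    Q[:] += alpha * delta * e
    Q[:] -= alpha * delta * e_opp

    if a_next == a_star:                         # Watkins: decay traces after greedy
        e[:] *= gamma * lam                      # actions, cut them after exploration
        e_opp[:] *= gamma * lam
    else:
        e[:] = 0.0
        e_opp[:] = 0.0

# Example call on a synthetic transition (illustrative values only):
update(s=0, a=2, r=1.0, s_next=1, a_next=0)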
Keywords :
learning (artificial intelligence); delayed reward; eligibility traces; large state space; opposition traces; opposition-based algorithm; reinforcement learning; Acceleration; Delay; Design engineering; Laboratories; Learning; Machine intelligence; Neural networks; Pattern analysis; System analysis and design; Systems engineering and theory;
fLanguage :
English
Publisher :
ieee
Conference_Title :
2006 International Joint Conference on Neural Networks (IJCNN '06)
Conference_Location :
Vancouver, BC
Print_ISBN :
0-7803-9490-9
Type :
conf
DOI :
10.1109/IJCNN.2006.246689
Filename :
1716100