DocumentCode :
2717050
Title :
Fitted Q Iteration with CMACs
Author :
Timmer, Stephan ; Riedmiller, Martin
Author_Institution :
Dept. of Comput. Sci., Osnabrueck Univ.
fYear :
2007
fDate :
1-5 April 2007
Firstpage :
1
Lastpage :
8
Abstract :
A major issue in model-free reinforcement learning is how to efficiently exploit the data collected by an exploration strategy. This is especially important for continuous, high-dimensional state spaces, since such spaces cannot be explored exhaustively. A simple but promising approach is to fix the number of state transitions sampled from the underlying Markov decision process. For several kernel-based learning algorithms, convergence proofs and notable empirical results exist when a fixed set of transition instances is used. In this article, we analyze how function approximators similar to the CMAC architecture can be combined with this idea. We show both analytically and empirically the potential power of the CMAC architecture combined with an offline version of Q-learning.
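The batch-mode idea the abstract describes can be sketched as follows. This is a hypothetical illustration, not the authors' code: a fixed set of sampled transitions, a CMAC-style tile-coding approximator (several offset tilings with a linear weight vector per action), and repeated regression of Bellman targets onto the approximator (fitted Q iteration). The toy MDP, tiling sizes, and step sizes are all assumptions chosen for a minimal runnable example.

```python
# Hypothetical sketch (not the paper's implementation): fitted Q iteration
# on a fixed batch of transitions, using CMAC-style tile coding.
import random

random.seed(0)

GAMMA = 0.95
N_TILINGS = 4     # overlapping tilings, offset against each other (CMAC-style)
TILES = 10        # tiles per tiling over the state interval [0, 1]
ALPHA = 0.2       # LMS step size, shared across the tilings

def features(s):
    """Index of the active tile in each tiling for state s."""
    idxs = []
    for i in range(N_TILINGS):
        offset = i / (N_TILINGS * TILES)
        idxs.append(i * TILES + min(int((s + offset) * TILES), TILES - 1))
    return idxs

def step(s, a):
    """Toy 1-D MDP (assumption): move left/right by 0.1; reward 1 for s' > 0.9."""
    s2 = min(max(s + (0.1 if a == 1 else -0.1), 0.0), 1.0)
    return s2, (1.0 if s2 > 0.9 else 0.0)

# Fixed set of sampled transitions, as in the batch-mode setting.
batch = []
for _ in range(500):
    s, a = random.random(), random.randrange(2)
    s2, r = step(s, a)
    batch.append((s, a, r, s2))

w = [[0.0] * (N_TILINGS * TILES) for _ in range(2)]  # one weight vector per action

def q(s, a):
    return sum(w[a][f] for f in features(s))

for _ in range(50):  # fitted Q iterations: freeze targets, then refit
    targets = [r + GAMMA * max(q(s2, 0), q(s2, 1)) for (s, a, r, s2) in batch]
    for _ in range(20):  # a few LMS sweeps approximate the batch regression
        for (s, a, r, s2), t in zip(batch, targets):
            err = t - q(s, a)
            for f in features(s):
                w[a][f] += ALPHA / N_TILINGS * err
```

After training, the greedy policy derived from `q` should prefer moving right toward the rewarding region from any state, e.g. `q(0.5, 1) > q(0.5, 0)`.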
Keywords :
Markov processes; cerebellar model arithmetic computers; computer architecture; iterative methods; learning (artificial intelligence); CMAC architecture; Markov decision process; Q-learning; fitted Q iteration; function approximators; kernel-based learning; reinforcement learning; Algorithm design and analysis; Computer science; Convergence; Dynamic programming; Inference algorithms; Interleaved codes; Sampling methods; Space exploration; State-space methods; Supervised learning;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Approximate Dynamic Programming and Reinforcement Learning, 2007. ADPRL 2007. IEEE International Symposium on
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0706-0
Type :
conf
DOI :
10.1109/ADPRL.2007.368162
Filename :
4220807