DocumentCode :
2717050
Title :
Fitted Q Iteration with CMACs
Author :
Timmer, Stephan ; Riedmiller, Martin
Author_Institution :
Dept. of Comput. Sci., Osnabrueck Univ.
fYear :
2007
fDate :
1-5 April 2007
Firstpage :
1
Lastpage :
8
Abstract :
A major issue in model-free reinforcement learning is how to efficiently exploit the data collected by an exploration strategy. This is especially important for continuous, high-dimensional state spaces, since such spaces cannot be explored exhaustively. A simple but promising approach is to fix the number of state transitions sampled from the underlying Markov decision process. For several kernel-based learning algorithms, convergence proofs and notable empirical results exist when a fixed set of transition instances is used. In this article, we analyze how function approximators similar to the CMAC architecture can be combined with this idea. We show both analytically and empirically the potential power of the CMAC architecture combined with an offline version of Q-learning.
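The batch-mode idea the abstract describes can be sketched as follows. This is a hypothetical illustration, not the authors' code: a fixed set of sampled transitions, a CMAC-style tile-coding approximator (several offset tilings with a linear weight vector per action), and repeated regression of Bellman targets onto the approximator (fitted Q iteration). The toy MDP, tiling sizes, and step sizes are all assumptions chosen for a minimal runnable example.

```python
# Hypothetical sketch (not the paper's implementation): fitted Q iteration
# on a fixed batch of transitions, using CMAC-style tile coding.
import random

random.seed(0)

GAMMA = 0.95
N_TILINGS = 4     # overlapping tilings, offset against each other (CMAC-style)
TILES = 10        # tiles per tiling over the state interval [0, 1]
ALPHA = 0.2       # LMS step size, shared across the tilings

def features(s):
    """Index of the active tile in each tiling for state s."""
    idxs = []
    for i in range(N_TILINGS):
        offset = i / (N_TILINGS * TILES)
        idxs.append(i * TILES + min(int((s + offset) * TILES), TILES - 1))
    return idxs

def step(s, a):
    """Toy 1-D MDP (assumption): move left/right by 0.1; reward 1 for s' > 0.9."""
    s2 = min(max(s + (0.1 if a == 1 else -0.1), 0.0), 1.0)
    return s2, (1.0 if s2 > 0.9 else 0.0)

# Fixed set of sampled transitions, as in the batch-mode setting.
batch = []
for _ in range(500):
    s, a = random.random(), random.randrange(2)
    s2, r = step(s, a)
    batch.append((s, a, r, s2))

w = [[0.0] * (N_TILINGS * TILES) for _ in range(2)]  # one weight vector per action

def q(s, a):
    return sum(w[a][f] for f in features(s))

for _ in range(50):  # fitted Q iterations: freeze targets, then refit
    targets = [r + GAMMA * max(q(s2, 0), q(s2, 1)) for (s, a, r, s2) in batch]
    for _ in range(20):  # a few LMS sweeps approximate the batch regression
        for (s, a, r, s2), t in zip(batch, targets):
            err = t - q(s, a)
            for f in features(s):
                w[a][f] += ALPHA / N_TILINGS * err
```

After training, the greedy policy derived from `q` should prefer moving right toward the rewarding region from any state, e.g. `q(0.5, 1) > q(0.5, 0)`.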
Keywords :
Markov processes; cerebellar model arithmetic computers; computer architecture; iterative methods; learning (artificial intelligence); CMAC architecture; Markov decision process; Q-learning; fitted Q iteration; function approximators; kernel-based learning; reinforcement learning; Algorithm design and analysis; Computer science; Convergence; Dynamic programming; Inference algorithms; Interleaved codes; Sampling methods; Space exploration; State-space methods; Supervised learning;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Approximate Dynamic Programming and Reinforcement Learning, 2007. ADPRL 2007. IEEE International Symposium on
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0706-0
Type :
conf
DOI :
10.1109/ADPRL.2007.368162
Filename :
4220807