DocumentCode :
3433937
Title :
TD-learning with exploration
Author :
Meyn, Sean P. ; Surana, Amit
Author_Institution :
Department of Electrical and Computer Engineering and the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign (UIUC), USA
fYear :
2011
fDate :
12-15 Dec. 2011
Firstpage :
148
Lastpage :
155
Abstract :
We introduce exploration in the TD-learning algorithm to approximate the value function for a given policy. In this way we can modify the norm used for approximation, “zooming in” to a region of interest in the state space. We also provide extensions to SARSA to eliminate the need for numerical integration in policy improvement. Construction of the algorithm and its analysis build on recent general results concerning the spectral theory of Markov chains and positive operators.
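For orientation only, the sketch below shows plain TD(0) with linear function approximation on a toy random-walk chain, where a bias in the simulated dynamics stands in for an exploratory sampling distribution that shapes where approximation error is weighted. The chain, feature basis, step size, and the `explore` parameter are all illustrative assumptions; this is not the authors' exploration-corrected TD-learning algorithm or their SARSA extension.

```python
# Illustrative only: plain TD(0) with linear function approximation on a toy
# chain. The `explore` bias merely skews which states are visited (and hence
# where approximation error is weighted); it is NOT the paper's
# exploration-corrected TD-learning or SARSA construction.
import numpy as np

rng = np.random.default_rng(0)

n_states = 10      # states 0..9; reaching state 9 ends an episode with reward 1
gamma = 0.95       # discount factor
alpha = 0.05       # TD step size

def features(s):
    """Polynomial features of the normalized state (an assumed basis)."""
    x = s / (n_states - 1)
    return np.array([1.0, x, x**2, x**3])

def step(s, explore=0.0):
    """Random walk: move right with probability 0.5 + explore, else left.
    A positive `explore` concentrates visits near the high-index states,
    standing in for an exploratory sampling distribution."""
    if rng.random() < 0.5 + explore:
        s_next = min(s + 1, n_states - 1)
    else:
        s_next = max(s - 1, 0)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, reward

theta = np.zeros(4)    # linear value-function weights
s = 0
for _ in range(50_000):
    s_next, r = step(s, explore=0.2)
    terminal = s_next == n_states - 1
    v_next = 0.0 if terminal else theta @ features(s_next)
    td_error = r + gamma * v_next - theta @ features(s)   # TD(0) temporal difference
    theta += alpha * td_error * features(s)               # stochastic-approximation update
    s = 0 if terminal else s_next                         # restart after absorption

print("approximate values:",
      [round(float(theta @ features(s)), 3) for s in range(n_states)])
```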
Keywords :
Approximation algorithms; Equations; Function approximation; Linear approximation; Markov processes; Mathematical model
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
2011 50th IEEE Conference on Decision and Control and European Control Conference (CDC-ECC)
Conference_Location :
Orlando, FL, USA
ISSN :
0743-1546
Print_ISBN :
978-1-61284-800-6
Electronic_ISBN :
0743-1546
Type :
conf
DOI :
10.1109/CDC.2011.6160851
Filename :
6160851