Regional Information Center for Science and Technology - Q-learning algorithms for optimal stopping based on least squares

DocumentCode :

2160328

Title :

Q-learning algorithms for optimal stopping based on least squares

Author :

Yu, Huizhen ; Bertsekas, Dimitri P.

Author_Institution :

Helsinki Inst. for Inf. Technol., Univ. of Helsinki, Helsinki, Finland

fYear :

2007

fDate :

2-5 July 2007

Firstpage :

2368

Lastpage :

2375

Abstract :

We consider the solution of discounted optimal stopping problems using linear function approximation methods. A Q-learning algorithm for such problems, proposed by Tsitsiklis and Van Roy, is based on the method of temporal differences and stochastic approximation. We propose alternative algorithms, which are based on projected value iteration ideas and least squares. We prove the convergence of some of these algorithms and discuss their properties.

Keywords :

Markov processes; approximation theory; decision theory; dynamic programming; iterative methods; learning (artificial intelligence); least squares approximations; pricing; DP; Markovian decision problem; Q-learning algorithm; financial derivative pricing; least squares; linear function approximation method; optimal stopping problem; stochastic approximation method; temporal difference method; value iteration; Approximation algorithms; Convergence; Equations; Q-factor; Vectors

fLanguage :

English

Publisher :

IEEE

Conference_Title :

2007 European Control Conference (ECC)

Conference_Location :

Kos

Print_ISBN :

978-3-9524173-8-6

Type :

conf

Filename :

7068523

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2160328