Regional Information Center for Science and Technology - Basis function adaptation methods for cost approximation in MDP

DocumentCode :

493366

Title :

Basis function adaptation methods for cost approximation in MDP

Author :

Yu, Huizhen ; Bertsekas, Dimitri P.

Author_Institution :

Dept. of Comput. Sci., Univ. of Helsinki, Helsinki

fYear :

2009

fDate :

March 30, 2009 - April 2, 2009

Firstpage :

Lastpage :

Abstract :

We generalize a basis adaptation method for cost approximation in Markov decision processes (MDP), extending earlier work of Menache, Mannor, and Shimkin. In our context, basis functions are parametrized and their parameters are tuned by minimizing an objective function involving the cost function approximation obtained when a temporal differences (TD) or other method is used. The adaptation scheme involves only low order calculations and can be implemented in a way analogous to policy gradient methods. In the generalized basis adaptation framework we provide extensions to TD methods for nonlinear optimal stopping problems and to alternative cost approximations beyond those based on TD.

Keywords :

Markov processes; approximation theory; decision theory; gradient methods; minimisation; nonlinear programming; MDP; Markov decision process; basis function adaptation method; cost function approximation; gradient method; nonlinear optimal stopping problem; objective function minimization; temporal difference; Computer science; Cost function; Design optimization; Differential equations; Function approximation; Gradient methods; Laboratories; Optimization methods; Process design; Vectors;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Adaptive Dynamic Programming and Reinforcement Learning, 2009. ADPRL '09. IEEE Symposium on

Conference_Location :

Nashville, TN

Print_ISBN :

978-1-4244-2761-1

Type :

conf

DOI :

10.1109/ADPRL.2009.4927528

Filename :

4927528

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=493366