Regional Information Center for Science and Technology - Infinite-Horizon Policy-Gradient Estimation with Variable Discount Factor for Markov Decision Process

DocumentCode :

2641763

Title :

Infinite-Horizon Policy-Gradient Estimation with Variable Discount Factor for Markov Decision Process

Author :

Bao, Bing-Kun ; Yin, Bao-Qun ; Xi, Hong-sheng

Author_Institution :

Dept. of Autom., China Univ. of Sci. & Technol., Hefei

fYear :

2008

fDate :

18-20 June 2008

Firstpage :

584

Lastpage :

584

Abstract :

This paper proposes a novel infinite-horizon policy-gradient estimation method with a variable discount factor. The method addresses a limitation of conventional policy-gradient estimation methods, the imbalance between bias and variance, by using an incremental sequence as the discount factor. Numerical experiments on a Markov decision process demonstrate its effectiveness.

Keywords :

Markov processes; decision theory; gradient methods; infinite horizon; Markov decision process; incremental sequence; infinite-horizon policy-gradient estimation; variable discount factor; Approximation algorithms; Automation; Computational modeling; Eigenvalues and eigenfunctions; Optimization methods; State estimation; State-space methods; Stochastic processes

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Innovative Computing Information and Control, 2008. ICICIC '08. 3rd International Conference on

Conference_Location :

Dalian, Liaoning

Print_ISBN :

978-0-7695-3161-8

Electronic_ISBN :

978-0-7695-3161-8

Type :

conf

DOI :

10.1109/ICICIC.2008.318

Filename :

4603773

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2641763