Title :
Policy-Gradient Based Actor-Critic Algorithms
Author :
Awate, Yogesh P.
Author_Institution :
marketRx - A Cognizant Co., Gurgaon, India
Abstract :
We consider the framework of a set of recently proposed two-timescale actor-critic algorithms for reinforcement learning that use the long-run average-reward criterion and linear feature-based value-function approximation. The actor update is based on the stochastic policy-gradient ascent rule. We derive a novel stochastic-gradient-based critic update that minimizes the variance of the policy-gradient estimator used in the actor update. We propose a novel baseline structure for minimizing the variance of an estimator and derive an optimal baseline that makes the covariance matrix the zero matrix, the best achievable result. We derive a novel actor update based on the optimal baseline deduced for an existing algorithm. We derive another novel actor update using the optimal baseline for an unbiased policy-gradient estimator, which we deduce from the Policy-Gradient Theorem with Function Approximation. Computational results demonstrate that the proposed algorithms outperform the state of the art on Garnet problems.
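To make the framework described in the abstract concrete, the following is a minimal sketch of a generic two-timescale actor-critic loop under the long-run average-reward criterion, with a linear critic and a policy-gradient actor update in which the TD error plays the role of a baseline-adjusted advantage estimate. This is not the authors' exact algorithm or baseline derivation: the softmax policy parameterization, the feature maps phi and psi, the environment interface env.reset()/env.step(a), and the step sizes alpha, beta, xi are illustrative assumptions.

    import numpy as np

    # Sketch only: two-timescale actor-critic with linear value-function
    # approximation and the average-reward criterion. All names below
    # (phi, psi, env, step sizes) are assumptions, not from the paper.

    def softmax_policy(theta, features_of_action, n_actions):
        """Boltzmann policy over action preferences theta^T phi(s, a)."""
        prefs = np.array([theta @ features_of_action(a) for a in range(n_actions)])
        prefs -= prefs.max()                      # numerical stability
        probs = np.exp(prefs)
        return probs / probs.sum()

    def actor_critic(env, phi, psi, n_actions, n_steps=100_000,
                     alpha=1e-3, beta=1e-2, xi=1e-2, seed=0):
        """
        phi(s, a): feature vector for the actor (policy) parameters.
        psi(s):    feature vector for the linear critic v(s) ~= w^T psi(s).
        Two timescales: critic step size beta larger than actor step size alpha.
        """
        rng = np.random.default_rng(seed)
        s = env.reset()
        theta = np.zeros(phi(s, 0).shape)         # actor parameters
        w = np.zeros(psi(s).shape)                # critic parameters
        rho = 0.0                                 # average-reward estimate

        for _ in range(n_steps):
            probs = softmax_policy(theta, lambda a: phi(s, a), n_actions)
            a = rng.choice(n_actions, p=probs)
            s_next, r = env.step(a)

            # TD error under the average-reward criterion
            delta = r - rho + w @ psi(s_next) - w @ psi(s)

            # Faster timescale: average-reward and critic updates
            rho += xi * (r - rho)
            w += beta * delta * psi(s)

            # Slower timescale: policy-gradient ascent; the state-value
            # estimate acts as a variance-reducing baseline via delta
            grad_log_pi = phi(s, a) - sum(probs[b] * phi(s, b)
                                          for b in range(n_actions))
            theta += alpha * delta * grad_log_pi

            s = s_next
        return theta, w, rho

The paper's contribution concerns how the critic parameters and the baseline are chosen to minimize the variance of the policy-gradient estimator; the sketch above only fixes the surrounding update structure that such choices plug into.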
Keywords :
approximation theory; covariance matrices; gradient methods; learning (artificial intelligence); stochastic processes; Garnet problems; average-reward criterion; covariance matrix; linear feature-based value-function approximation; policy-gradient based actor-critic algorithms; reinforcement learning; stochastic policy-gradient ascent rule; zero matrix; Approximation algorithms; Convergence; Covariance matrix; Function approximation; Garnets; Intelligent systems; Learning; Linear approximation; State-space methods; Stochastic processes; actor-critic algorithms; policy-gradient methods; reinforcement learning;
Conference_Title :
2009 WRI Global Congress on Intelligent Systems (GCIS '09)
Conference_Location :
Xiamen
Print_ISBN :
978-0-7695-3571-5
DOI :
10.1109/GCIS.2009.372