Title :
A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients
Author :
Ivo Grondman; Lucian Busoniu; Gabriel A. D. Lopes; Robert Babuska
Author_Institution :
Delft Center for Systems and Control, Delft University of Technology, The Netherlands
Abstract :
Policy-gradient-based actor-critic algorithms are among the most popular algorithms in the reinforcement learning framework. Their ability to search for optimal policies using low-variance gradient estimates has made them useful in several real-life applications, such as robotics, power control, and finance. Although general surveys on reinforcement learning techniques already exist, none is dedicated specifically to actor-critic algorithms. This paper therefore describes the state of the art of actor-critic algorithms, with a focus on methods that can work in an online setting and use function approximation to deal with continuous state and action spaces. After a discussion of the concepts of reinforcement learning and the origins of actor-critic algorithms, the paper describes the workings of the natural gradient, which has made its way into many actor-critic algorithms over the past few years. A review of several standard and natural actor-critic algorithms is given, and the paper concludes with an overview of application areas and a discussion of open issues.
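For orientation, the two gradient forms the survey contrasts can be written compactly as follows. The notation below (θ for the policy parameters, F(θ) for the Fisher information matrix of the policy) is the standard one in this literature, not an excerpt from the paper itself.

```latex
% Standard policy gradient (policy gradient theorem); the critic supplies
% an estimate of Q^{\pi}(s,a):
\nabla_{\theta} J(\theta)
  = \mathbb{E}_{s,a \sim \pi_{\theta}}\!\left[
      \nabla_{\theta} \log \pi_{\theta}(a \mid s)\, Q^{\pi}(s,a) \right]

% Natural policy gradient: the standard gradient preconditioned by the
% inverse Fisher information matrix of the policy, which makes the update
% direction invariant to how \pi_{\theta} is parameterized:
\widetilde{\nabla}_{\theta} J(\theta)
  = F(\theta)^{-1} \nabla_{\theta} J(\theta),
\qquad
F(\theta) = \mathbb{E}\!\left[
    \nabla_{\theta} \log \pi_{\theta}(a \mid s)\,
    \nabla_{\theta} \log \pi_{\theta}(a \mid s)^{\top} \right]
```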
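To make "an online setting with function approximation" concrete, here is a minimal sketch of a one-step temporal-difference actor-critic of the general kind the survey reviews: a linear critic and a Gaussian policy on a toy scalar control problem. All specifics (the feature map, step sizes, and the toy dynamics) are illustrative assumptions, not the paper's pseudocode.

```python
import numpy as np

# Minimal one-step (TD(0)) actor-critic sketch with linear function
# approximation and a Gaussian policy. The environment, features, and
# step sizes are hypothetical, chosen only for illustration.

rng = np.random.default_rng(0)

def features(s):
    # Hypothetical 3-dimensional feature vector for a scalar state.
    return np.array([1.0, s, s * s])

theta = np.zeros(3)   # actor parameters: mean of the Gaussian policy
w = np.zeros(3)       # critic parameters: linear state-value function
sigma = 0.5           # fixed exploration noise
alpha_actor, alpha_critic, gamma = 1e-3, 1e-2, 0.95

s = 0.0
for step in range(10_000):
    phi = features(s)
    mu = theta @ phi
    a = mu + sigma * rng.standard_normal()        # sample an action online

    # Toy linear dynamics and quadratic cost, standing in for a real system:
    s_next = 0.9 * s + a + 0.1 * rng.standard_normal()
    r = -(s_next ** 2) - 0.01 * a ** 2

    # Critic: TD(0) error and update of V(s) = w^T phi(s).
    delta = r + gamma * (w @ features(s_next)) - w @ phi
    w += alpha_critic * delta * phi

    # Actor: policy-gradient step; for a Gaussian policy,
    # grad_theta log pi(a|s) = (a - mu) / sigma^2 * phi(s).
    theta += alpha_actor * delta * (a - mu) / sigma**2 * phi

    s = s_next
```

The design point this illustrates is the division of labor the abstract describes: the critic's TD error serves as a low-variance estimate of how much better the sampled action was than average, and the actor follows the policy gradient it induces.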
Keywords :
"Approximation methods","Equations","Approximation algorithms","Standards","Optimization","Convergence"
Journal_Title :
IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews)
DOI :
10.1109/TSMCC.2012.2218595