Reinforcement Learning is Direct Adaptive Optimal Control

Author

Sutton, Richard S. ; Barto, Andrew G. ; Williams, Ronald J.

Author_Institution

GTE Laboratories Inc., Waltham, MA 02254

fYear

1991

fDate

26-28 June 1991

Firstpage

2143

Lastpage

2146

Abstract

Control problems can be divided into two classes: 1) regulation and tracking problems, in which the objective is to follow a reference trajectory, and 2) optimal control problems, in which the objective is to extremize a functional of the controlled system's behavior that is not necessarily defined in terms of a reference trajectory. Adaptive methods for problems of the first kind are well known, and include self-tuning regulators and model-reference methods, whereas adaptive methods for optimal-control problems have received relatively little attention. Moreover, the adaptive optimal-control methods that have been studied are almost all indirect methods, in which controls are recomputed from an estimated system model at each step. This computation is inherently complex, making adaptive methods in which the optimal controls are estimated directly more attractive. Here we present reinforcement learning methods as a computationally simple, direct approach to the adaptive optimal control of nonlinear systems.

Keywords

Adaptive control; Control system synthesis; Control systems; Learning; Legged locomotion; Nonlinear systems; Optimal control; Programmable control; Robust control; Trajectory;

fLanguage

English

Publisher

IEEE

Conference_Titel

American Control Conference, 1991

Conference_Location

Boston, MA, USA

Print_ISBN

0-87942-565-2

Type

conf

Filename

4791776