Kernel-Based Approximate Dynamic Programming for Real-Time Online Learning Control: An Experimental Study

Author

Xin Xu; Chuanqiang Lian; Lei Zuo; Haibo He

Author_Institution

Inst. of Unmanned Syst., Nat. Univ. of Defense Technol., Changsha, China

Volume

22

Issue

1

fYear

2014

fDate

Jan. 2014

Firstpage

146

Lastpage

156

Abstract

In the past decade, there has been considerable research interest in learning control methods based on reinforcement learning (RL) and approximate dynamic programming (ADP). As an important class of function approximation techniques, kernel methods have recently been applied to improve the generalization ability of RL and ADP methods, but most previous work was based only on simulation. This paper focuses on experimental studies of real-time online learning control for nonlinear systems using kernel-based ADP methods. Specifically, the kernel-based dual heuristic programming (KDHP) method is applied and tested on real-time control systems. Two kernel-based online learning control schemes are presented for uncertain nonlinear systems, one using simulation data and the other using online sampling data. Learning control experiments were performed on a single-link inverted pendulum system and on a double-link inverted pendulum system. The experimental results show that both online learning control schemes, whether using simulation data or real sampling data, are effective in approximating near-optimal control policies of nonlinear dynamical systems with model uncertainties. In addition, KDHP is shown to achieve better performance than conventional DHP, which uses multilayer perceptron neural networks.

Keywords

Markov processes; dynamic programming; function approximation; generalisation (artificial intelligence); learning (artificial intelligence); learning systems; nonlinear control systems; optimal control; pendulums; sampling methods; uncertain systems; KDHP method; RL; double-link inverted pendulum system; function approximation techniques; generalization ability; kernel-based ADP methods; kernel-based approximate dynamic programming; kernel-based dual heuristic programming; learning control methods; model uncertainties; multilayer perceptron neural networks; near-optimal control policies; realtime online learning control; reinforcement learning; sampling data; single-link inverted pendulum system; uncertain nonlinear systems; Approximate dynamic programming (ADP); Markov decision processes (MDPs); learning control; online learning; reinforcement learning (RL)

fLanguage

English

Journal_Title

Control Systems Technology, IEEE Transactions on

Publisher

IEEE

ISSN

1063-6536

Type

jour

DOI

10.1109/TCST.2013.2246866

Filename

6492103