Author_Institution :
Coll. of Mechatron. & Autom., Nat. Univ. of Defense Technol., Changsha, China
Abstract :
In the past decade, adaptive critic designs (ACDs), including heuristic dynamic programming (HDP), dual heuristic programming (DHP), and their action-dependent versions, have been widely studied to realize online learning control of dynamical systems. However, because neural networks with manually designed features are commonly used to deal with continuous state and action spaces, the generalization capability and learning efficiency of previous ACDs still need to be improved. In this paper, a novel framework of ACDs with sparse kernel machines is presented by integrating kernel methods into the critic of ACDs. To improve the generalization capability as well as the computational efficiency of kernel machines, a sparsification method based on approximately linear dependence analysis is used. Using the sparse kernel machines, two kernel-based ACD algorithms, namely kernel HDP (KHDP) and kernel DHP (KDHP), are proposed, and their performance is analyzed both theoretically and empirically. Because of the representation learning and generalization capability of sparse kernel machines, KHDP and KDHP can obtain much better performance than previous HDP and DHP with manually designed neural networks. Simulation and experimental results on two nonlinear control problems, namely a continuous-action inverted pendulum problem and a ball and plate control problem, demonstrate the effectiveness of the proposed kernel ACD methods.
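As an illustration of the sparsification step named in the abstract, the following is a minimal sketch of a generic approximately linear dependence (ALD) test for building a kernel dictionary. It is not taken from the paper: the Gaussian kernel, its width, the threshold value, the small ridge term, and the function names `gaussian_kernel` and `ald_dictionary` are all assumptions made for this example. A sample is added to the dictionary only when its feature-space image cannot be approximated, within the threshold, by a linear combination of the images of the samples already stored.

```python
import numpy as np


def gaussian_kernel(x, y, width=1.0):
    """Gaussian (RBF) kernel; the kernel choice and width are assumptions."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * width ** 2)))


def ald_dictionary(samples, kernel=gaussian_kernel, threshold=1e-3):
    """Build a sparse dictionary with a generic ALD test (hypothetical helper).

    For each new sample x, the squared residual of the best approximation of
    phi(x) by the dictionary features is
        delta = k(x, x) - k_vec^T K^{-1} k_vec,
    where K is the dictionary kernel matrix and k_vec holds the kernel values
    between the dictionary elements and x. x is kept only if delta > threshold.
    """
    samples = [np.asarray(s, dtype=float) for s in samples]
    if not samples:
        return []
    dictionary = [samples[0]]
    for x in samples[1:]:
        K = np.array([[kernel(a, b) for b in dictionary] for a in dictionary])
        k_vec = np.array([kernel(d, x) for d in dictionary])
        # A small ridge term (an assumption) keeps the solve well conditioned.
        coeffs = np.linalg.solve(K + 1e-10 * np.eye(len(dictionary)), k_vec)
        delta = kernel(x, x) - float(k_vec @ coeffs)
        if delta > threshold:
            dictionary.append(x)
    return dictionary


if __name__ == "__main__":
    states = [[0.0], [0.0], [5.0], [5.0]]
    print(len(ald_dictionary(states)))
```

In this sketch, repeated or nearly identical states do not enlarge the dictionary, so the number of kernel basis functions used by a critic stays well below the number of observed samples. A larger threshold gives a sparser dictionary at the cost of approximation accuracy.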
Keywords :
adaptive control; approximation theory; dynamic programming; generalisation (artificial intelligence); heuristic programming; learning (artificial intelligence); learning systems; nonlinear control systems; sparse matrices; KDHP; KHDP; adaptive critic designs; approximately linear dependence analysis; ball-and-plate control problem; computational efficiency improvement; continuous action space; continuous state space; continuous-action inverted pendulum problem; dual-heuristic programming; dynamical system online learning control; generalization capability improvement; heuristic dynamic programming; kernel DHP; kernel HDP; kernel-based ACD algorithms; learning efficiency improvement; nonlinear control problems; representation learning; sparse kernel machines; sparsification method; Approximation algorithms; Approximation methods; Dictionaries; Kernel; Learning systems; Machine learning; Vectors; Adaptive critic designs; Markov decision processes; approximate dynamic programming; kernel machines; learning control; reinforcement learning;