Title :
Convergence accelerated by the improvements of stepsize and gradient in SPSA
Author :
Zhang, Huajun ; Zhao, Jin ; Geng, Tao
Author_Institution :
Dept. of Control Sci. & Eng., Huazhong Univ. of Sci. & Technol. (HUST), Wuhan, China
Abstract :
The simultaneous perturbation stochastic approximation (SPSA) method is effective for optimizing complex systems in which the gradient of the objective function is difficult or impossible to obtain directly and only measurements of the objective function are available. SPSA uses these measurements to estimate the gradient efficiently. Many improvements have been proposed to accelerate the convergence of SPSA; a typical one replaces the first-order gradient approximation of standard SPSA with a Newton-Raphson approach. Although the resulting second-order SPSA (2SPSA) algorithm solves the optimization problem successfully through efficient gradient approximation, its accuracy depends on the conditioning of the Hessian matrix of the objective function. To eliminate this dependence on the Hessian, this paper uses a nonlinear conjugate gradient method to determine the search direction. By synthesizing different nonlinear conjugate gradient formulas, it ensures that every search direction is a descent direction. Besides the search-direction improvement, this paper also improves the stepsize calculation of SPSA: a suitable stepsize is computed from the current and previous gradient information. With descent search directions and appropriate stepsizes, the improved SPSA converges faster than 2SPSA. The advantages of the improved algorithm are validated by applying it to a reinforcement learning problem.
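As an illustration of the scheme the abstract describes, the following is a minimal Python sketch of SPSA with a conjugate-gradient search direction and a stepsize adjusted from current and previous gradient estimates. The Rademacher perturbation and the 0.602/0.101 gain-decay exponents are the standard SPSA choices; the Polak-Ribiere+ direction update with descent restart, the specific stepsize scaling, and the quadratic test objective are illustrative assumptions, since the abstract does not specify which conjugate gradient formulas the paper synthesizes or its exact stepsize rule.

    import numpy as np

    def spsa_gradient(f, theta, c_k, rng):
        # Standard SPSA estimate: two measurements of f, perturbed
        # simultaneously along a random Rademacher direction delta.
        delta = rng.choice([-1.0, 1.0], size=theta.shape)
        y_plus = f(theta + c_k * delta)
        y_minus = f(theta - c_k * delta)
        # Elementwise division by delta (delta is +/-1, so 1/delta == delta).
        return (y_plus - y_minus) / (2.0 * c_k * delta)

    def improved_spsa(f, theta0, n_iter=200, a=0.1, c=0.1, seed=0):
        rng = np.random.default_rng(seed)
        theta = np.asarray(theta0, dtype=float)
        g_prev, d = None, None
        for k in range(n_iter):
            c_k = c / (k + 1) ** 0.101        # standard SPSA perturbation decay
            g = spsa_gradient(f, theta, c_k, rng)
            if g_prev is None:
                d = -g                        # first step: steepest descent
            else:
                # Polak-Ribiere+ coefficient, clipped at zero so the update
                # restarts with steepest descent when the formula misbehaves.
                beta = max(0.0, g @ (g - g_prev) / (g_prev @ g_prev))
                d = -g + beta * d
                if g @ d >= 0.0:              # enforce a descent direction
                    d = -g
            # Illustrative stepsize using current/previous gradient magnitudes
            # (a stand-in for the paper's unspecified stepsize rule).
            a_k = a / (k + 1) ** 0.602
            if g_prev is not None:
                a_k *= min(2.0, (g_prev @ g_prev) / max(g @ g, 1e-12))
            theta = theta + a_k * d
            g_prev = g
        return theta

    # Usage on a simple quadratic test objective.
    if __name__ == "__main__":
        f = lambda x: np.sum((x - 3.0) ** 2)
        print(improved_spsa(f, np.zeros(5)))

Clipping the conjugate coefficient at zero and restarting with the negative estimated gradient whenever g @ d >= 0 is one standard way to guarantee a descent direction, which is the property the paper requires of its synthesized search directions.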
Keywords :
Hessian matrices; Newton-Raphson method; approximation theory; convergence of numerical methods; gradient methods; optimisation; Hessian matrix conditioning; objective function Hessian; Newton-Raphson gradient approximation approach; SPSA gradient; SPSA stepsize; nonlinear conjugate gradient method; optimization problem; reinforcement learning; second order SPSA algorithm; simultaneous perturbation stochastic approximation; Acceleration; Approximation algorithms; Approximation methods; Convergence; Genetic algorithms; Gradient methods; Learning; Newton-Raphson; SPSA; conjugate gradient; motion control;
Conference_Title :
2011 Chinese Control and Decision Conference (CCDC)
Conference_Location :
Mianyang
Print_ISBN :
978-1-4244-8737-0
DOI :
10.1109/CCDC.2011.5968131