DocumentCode :
493368
Title :
Iterative local dynamic programming
Author :
Todorov, Emanuel ; Tassa, Yuval
Author_Institution :
Dept. of Cognitive Sci., Univ. of California San Diego, San Diego, CA
fYear :
2009
fDate :
March 30 2009-April 2 2009
Firstpage :
90
Lastpage :
95
Abstract :
We develop an iterative local dynamic programming method (iLDP) applicable to stochastic optimal control problems in continuous high-dimensional state and action spaces. Such problems are common in the control of biological movement, but cannot be handled by existing methods. iLDP can be considered a generalization of differential dynamic programming, inasmuch as: (a) we use general basis functions rather than quadratics to approximate the optimal value function; (b) we introduce a collocation method that dispenses with explicit differentiation of the cost and dynamics and ties iLDP to the unscented Kalman filter; (c) we adapt the local function approximator to the propagated state covariance, thus increasing accuracy at more likely states. Convergence is similar to quasi-Newton methods. We illustrate iLDP on several problems including the "swimmer" dynamical system, which has 14 state and 4 control variables.
Keywords :
Kalman filters; Newton method; covariance analysis; dynamic programming; optimal control; stochastic systems; action spaces; collocation method; continuous high-dimensional state; differential dynamic programming; explicit differentiation; iterative local dynamic programming; local function approximator; optimal value function; quasi-Newton methods; state covariance; stochastic optimal control problems; swimmer dynamical system; unscented Kalman filter; Control systems; Costs; Dynamic programming; Function approximation; Iterative methods; Learning; Open loop systems; Optimal control; Stochastic processes; Stochastic resonance;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Adaptive Dynamic Programming and Reinforcement Learning, 2009. ADPRL '09. IEEE Symposium on
Conference_Location :
Nashville, TN
Print_ISBN :
978-1-4244-2761-1
Type :
conf
DOI :
10.1109/ADPRL.2009.4927530
Filename :
4927530