DocumentCode :
1799309
Title :
Information-theoretic stochastic optimal control via incremental sampling-based algorithms
Author :
Arslan, Oktay ; Theodorou, Evangelos A. ; Tsiotras, Panagiotis
Author_Institution :
Daniel Guggenheim Sch. of Aerosp. Eng., Georgia Inst. of Technol., Atlanta, GA, USA
fYear :
2014
fDate :
9-12 Dec. 2014
Firstpage :
1
Lastpage :
8
Abstract :
This paper considers the optimal control of dynamical systems represented by nonlinear stochastic differential equations. It is well known that the optimal control policy for this problem can be obtained as a function of a value function that satisfies a nonlinear partial differential equation, namely, the Hamilton-Jacobi-Bellman equation. This nonlinear PDE must be solved backwards in time, and the computation is intractable for large-scale systems. Under certain assumptions, and after applying a logarithmic transformation, an alternative characterization of the optimal policy can be given in terms of a path integral. Path Integral (PI) based control methods have recently been shown to provide elegant solutions to a broad class of stochastic optimal control problems. One of the implementation challenges of this formalism is the computation of the expectation of a cost functional over the trajectories of the unforced dynamics. Computing such an expectation over uniformly sampled trajectories may induce numerical instabilities due to the exponentiation of the cost. Therefore, sampling of low-cost trajectories is essential for the practical implementation of PI-based methods. In this paper, we use incremental sampling-based algorithms to sample useful trajectories from the unforced system dynamics, and make a novel connection between Rapidly-exploring Random Trees (RRTs) and information-theoretic stochastic optimal control. We present results from the numerical implementation of the proposed approach on several examples.
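The exponentiation issue described in the abstract can be illustrated with a minimal sketch, not the paper's algorithm: a Monte Carlo path-integral estimate for a 1-D Brownian system, where the control is a softmin-weighted average of sampled noise increments. The dynamics, cost, and all parameter values (`goal`, `lam`, `sigma`, etc.) are illustrative assumptions; the shift by the minimum cost before exponentiating is the standard trick to avoid the numerical underflow the abstract refers to.

```python
import math
import random

random.seed(1)

def rollout(x0, steps, dt, sigma):
    """Sample one unforced trajectory of dx = sigma * dW.
    Returns the terminal state and the sequence of noise increments."""
    x, incs = x0, []
    for _ in range(steps):
        dw = random.gauss(0.0, math.sqrt(dt))
        incs.append(dw)
        x += sigma * dw
    return x, incs

def pi_first_control(x0, goal, n=2000, steps=20, dt=0.05, sigma=1.0, lam=0.5):
    """Path-integral estimate of the initial control: a softmin-weighted
    average of the first noise increment over n unforced rollouts.
    Weights are exp(-S_i / lam); terminal cost S_i = (x_T - goal)^2."""
    samples = [rollout(x0, steps, dt, sigma) for _ in range(n)]
    costs = [(x_T - goal) ** 2 for x_T, _ in samples]
    s_min = min(costs)  # subtract the min cost before exponentiating
    w = [math.exp(-(s - s_min) / lam) for s in costs]  # avoids underflow
    z = sum(w)
    return sum(wi * incs[0] for wi, (_, incs) in zip(w, samples)) / (z * dt)

u0 = pi_first_control(0.0, goal=1.0)
print(u0)  # a control pushing the state toward the goal
```

Note that if most rollouts land far from the goal, nearly all weights collapse toward zero and the estimate degenerates — which is exactly why the paper advocates biasing the sampling toward low-cost trajectories (here, via RRTs) rather than sampling the unforced dynamics uniformly.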
Keywords :
information theory; large-scale systems; nonlinear differential equations; numerical stability; optimal control; partial differential equations; sampling methods; stochastic systems; trees (mathematics); Hamilton-Jacobi-Bellman equation; PI based control method; PI-based method; RRT; alternative characterization; cost exponentiation; cost functional; dynamical system; incremental sampling-based algorithm; information-theoretic stochastic optimal control; large scale system; logarithmic transformation; low-cost trajectory; nonlinear PDE; nonlinear partial differential equation; nonlinear stochastic differential equation; numerical implementation; numerical instability; optimal control policy; optimal policy; path integral; rapidly-exploring random trees; stochastic optimal control problem; unforced dynamics; unforced system dynamics; Aerospace electronics; Entropy; Equations; Heuristic algorithms; Noise; Optimal control; Trajectory; path integral; sampling-based algorithms; stochastic optimal control;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2014 IEEE Symposium on
Conference_Location :
Orlando, FL
Type :
conf
DOI :
10.1109/ADPRL.2014.7010617
Filename :
7010617