Title :
Iterated approximate value functions
Author :
O'Donoghue, Brendan; Wang, Yang; Boyd, Stephen
Abstract :
In this paper we introduce a control policy, which we refer to as the iterated approximate value function policy. Generating the policy involves two stages: the first is carried out off-line, the second on-line. In the first stage we simultaneously compute a trajectory of moments of the state and action and a sequence of approximate value functions optimized for that trajectory. In the second stage we carry out control using the generated sequence of approximate value functions, which yields a time-varying policy even when the optimal policy is time-invariant. We restrict our attention to problems with linear dynamics and a quadratically representable stage cost. In this case the pre-computation stage requires solving a semidefinite program (SDP), and finding the control action at each time period requires solving a small convex optimization problem, which can be done quickly. We conclude with some examples.
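Illustrative example :
To give a concrete feel for the two stages, the following is a minimal sketch, not the paper's iterated moment-trajectory formulation: it computes a single time-invariant quadratic value-function lower bound via the standard Bellman-inequality LMI (an SDP, solved here with CVXPY), and then evaluates the resulting one-step greedy policy on-line. All problem data (A, B, Q, R, gamma, Sigma) are assumed for illustration, and the additive noise term is dropped, since for a quadratic value function it only shifts the bound by a constant and does not change the greedy action.

# Minimal sketch (assumed data, not the paper's exact formulation):
# fit a quadratic lower bound V(x) = x'Px via the Bellman-inequality
# LMI, then evaluate the resulting one-step greedy policy.
import numpy as np
import cvxpy as cp

n, m = 4, 2           # state and input dimensions (illustrative)
gamma = 0.95          # discount factor
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n)) / np.sqrt(n)   # dynamics (illustrative)
B = rng.standard_normal((n, m))
Q = np.eye(n)         # quadratic stage cost x'Qx + u'Ru
R = 0.1 * np.eye(m)
Sigma = np.eye(n)     # second moment of the initial state (illustrative)

# Off-line stage: find P such that V(x) = x'Px satisfies the Bellman
# inequality V(x) <= min_u [ x'Qx + u'Ru + gamma * V(Ax + Bu) ],
# which for quadratics is the linear matrix inequality M >> 0 below.
P = cp.Variable((n, n), symmetric=True)
M = cp.bmat([
    [Q + gamma * A.T @ P @ A - P, gamma * A.T @ P @ B],
    [gamma * B.T @ P @ A,         R + gamma * B.T @ P @ B],
])
prob = cp.Problem(cp.Maximize(cp.trace(Sigma @ P)), [M >> 0, P >> 0])
prob.solve()
P_val = P.value

# On-line stage: the greedy action minimizes stage cost plus discounted
# approximate next-state value; here the small convex problem is a QP
# with the closed-form solution u = -(R + g B'PB)^{-1} g B'PA x.
def policy(x):
    K = np.linalg.solve(R + gamma * B.T @ P_val @ B,
                        gamma * B.T @ P_val @ A)
    return -K @ x

x = rng.standard_normal(n)
print("greedy action:", policy(x))

In the paper's setting, the off-line SDP instead couples the moment trajectory with a sequence of approximate value functions, producing a different quadratic at each time step; the sketch above shows only the time-invariant special case of the greedy evaluation.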
Keywords :
convex programming; iterative methods; optimal control; stochastic systems; SDP; infinite horizon discounted stochastic control problem; iterated approximate value function policy; iterated approximate value functions; linear dynamics; optimal control policy; semidefinite programming; small convex optimization problem; stage cost function; time-invariant system; time-varying policy; Approximation methods; Convex functions; Dynamic programming; Noise; Optimization; Trajectory; Vectors
Conference_Title :
2013 European Control Conference (ECC)
Conference_Location :
Zurich, Switzerland