  • DocumentCode
    646403
  • Title
    Iterated approximate value functions
  • Author
    O'Donoghue, Brendan; Wang, Yang; Boyd, Stephen
  • fYear
    2013
  • fDate
    17-19 July 2013
  • Firstpage
    3882
  • Lastpage
    3888
  • Abstract
    In this paper we introduce a control policy that we refer to as the iterated approximate value function policy. Generating this policy requires two stages: the first carried out off-line, the second on-line. In the first (off-line) stage we simultaneously compute a trajectory of moments of the state and action and a sequence of approximate value functions optimized to that trajectory. In the second (on-line) stage we perform control using the generated sequence of approximate value functions. This yields a time-varying policy, even when the optimal policy is time-invariant. We restrict our attention to the case of linear dynamics and a quadratically representable stage cost function, in which case the pre-computation stage requires the solution of a semidefinite program (SDP). Finding the control action at each time period requires solving a small convex optimization problem, which can be done quickly. We conclude with some examples.
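    As a rough illustration of the two-stage structure the abstract describes, the sketch below sets up the quadratic special case: an off-line SDP that fits a quadratic approximate value function V(x) = x'Px + p via a Bellman inequality, followed by an on-line policy that solves a small convex problem at each step. This is a simplified single-value-function variant, not the paper's iterated moment-trajectory scheme; all problem data and names are hypothetical, and numpy/cvxpy are assumed.

    # Hedged sketch: standard ADP lower-bound SDP for linear dynamics and
    # quadratic stage cost, plus the resulting one-step convex policy.
    # NOT the paper's full iterated scheme (which optimizes a sequence of
    # value functions along a state/action moment trajectory); all data
    # below is made up for illustration.
    import cvxpy as cp
    import numpy as np

    np.random.seed(0)
    n, m, gamma = 4, 2, 0.95                      # dims and discount factor
    A = np.random.randn(n, n)
    A /= 1.1 * np.max(np.abs(np.linalg.eigvals(A)))  # roughly stable dynamics
    B = np.random.randn(n, m)
    Q, R = np.eye(n), 0.1 * np.eye(m)             # quadratic stage cost
    W = 0.01 * np.eye(n)                          # process-noise covariance
    Sigma0 = np.eye(n)                            # initial-state covariance

    # Off-line stage: choose V(x) = x'Px + p maximizing the bound E V(x0),
    # subject to the Bellman inequality
    #   x'Qx + u'Ru + gamma*E V(Ax+Bu+w) - V(x) >= 0  for all (x, u),
    # which for quadratic V is the LMI below plus one scalar inequality.
    P, p = cp.Variable((n, n), symmetric=True), cp.Variable()
    lmi = cp.bmat([[Q + gamma * A.T @ P @ A - P, gamma * A.T @ P @ B],
                   [gamma * B.T @ P @ A, R + gamma * B.T @ P @ B]])
    cons = [lmi >> 0,
            gamma * cp.trace(P @ W) + (gamma - 1) * p >= 0,
            P >> 0]
    cp.Problem(cp.Maximize(cp.trace(P @ Sigma0) + p), cons).solve()
    Ps = 0.5 * (P.value + P.value.T)              # symmetrize numerically
    L = np.linalg.cholesky(Ps + 1e-8 * np.eye(n)) # factor for the quad term

    # On-line stage: at each step, solve the small convex problem
    #   u_t = argmin_u  u'Ru + gamma*(A x_t + B u)' P (A x_t + B u),
    # i.e. the greedy action w.r.t. V (terms constant in u are dropped).
    def policy(x):
        u = cp.Variable(m)
        z = A @ x + B @ u                         # mean of the next state
        cost = cp.quad_form(u, R) + gamma * cp.sum_squares(L.T @ z)
        cp.Problem(cp.Minimize(cost)).solve()
        return u.value

    print("action at a random state:", policy(np.random.randn(n)))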
  • Keywords
    convex programming; iterative methods; optimal control; stochastic systems; SDP; infinite horizon discounted stochastic control problem; iterated approximate value function policy; iterated approximate value functions; linear dynamics; optimal control policy; semidefinite programming; small convex optimization problem; stage cost function; time-invariant system; time-varying policy; Approximation methods; Convex functions; Dynamic programming; Noise; Optimization; Trajectory; Vectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2013 European Control Conference (ECC)
  • Conference_Location
    Zurich
  • Type
    conf
  • Filename
    6669813