An asymptotically efficient algorithm for finite horizon stochastic dynamic programming problems

Author

Chang, Hyeong Soo ; Fu, Michael C. ; Marcus, Steven I.

Author_Institution

Dept. of Comput. Sci. & Eng., Sogang Univ., Seoul, South Korea

Volume

4

fYear

2003

fDate

9-12 Dec. 2003

Firstpage

3818

Abstract

We present a novel algorithm, called "Simulated Annealing Multiplicative Weights", for approximately solving large (discrete-time) finite-horizon stochastic dynamic programming problems. The algorithm is "asymptotically efficient" in the sense that a finite-time bound for the sample mean of the optimal value function over a given finite policy space can be obtained, and the bound approaches the optimal value as the number of iterations increases. The algorithm updates a probability distribution over the given policy space with a very simple rule, and the sequence of distributions generated by the algorithm converges to a distribution concentrated only on the optimal policies for the given policy space. We also discuss how to reduce the computational cost of the algorithm to apply it in practice.

Keywords

discrete time systems; dynamic programming; infinite horizon; probability; simulated annealing; stochastic programming; asymptotically efficient algorithm; computational cost reduction; discrete time system; finite horizon; finite policy space; finite time bound; iterations; optimal policies; optimal value function; probability distribution; simulated annealing multiplicative weights; stochastic dynamic programming; Annealing; Computational efficiency; Computer science; Control systems; Dynamic programming; Educational institutions; Probability distribution; Random variables; Stochastic processes; Uncertainty;

fLanguage

English

Publisher

ieee

Conference_Titel

Decision and Control, 2003. Proceedings. 42nd IEEE Conference on

ISSN

0191-2216

Print_ISBN

0-7803-7924-1

Type

conf

DOI

10.1109/CDC.2003.1271744

Filename

1271744