Weighted discounted dynamic programming

Author

Feinberg, Eugene A. ; Shwartz, Adam

Author_Institution

State Univ. of New York, Stony Brook, NY, USA

fYear

1991

fDate

11-13 Dec 1991

Firstpage

485

Abstract

The authors consider a discrete-time Markov decision process with an infinite horizon. They maximize the sum of several standard discounted rewards, each with a different discount factor. It is shown that, under this criterion, for some positive ε there need not exist an ε-optimal stationary strategy, even when the state and action sets are finite. However, under weak conditions, ε-optimal Markov strategies are exhibited that are stationary from some time onward. When both the state and action sets are finite, there exists an optimal Markov strategy with this property. An explicit algorithm for the computation of such strategies is included.
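
The criterion in the abstract can be illustrated with a toy calculation. The sketch below is not taken from the paper: it assumes a hypothetical one-state process with two actions ("a" and "b"), two reward streams with discount factors 0.5 and 0.9, and made-up reward values. The function name `weighted_value` and the truncation horizon are likewise illustrative choices. It compares the two stationary strategies with a Markov strategy that plays "a" for two steps and "b" from then on.

```python
def weighted_value(head, tail_action, rewards, betas, horizon=500):
    """Truncated weighted discounted value of a deterministic Markov strategy.

    head        -- actions used at times 0, 1, ..., len(head) - 1
    tail_action -- action used at every later time (the stationary part)
    rewards     -- one dict per reward stream, mapping action -> reward
    betas       -- one discount factor per reward stream
    horizon     -- number of steps summed (an approximation of the infinite sum)
    """
    total = 0.0
    for t in range(horizon):
        action = head[t] if t < len(head) else tail_action
        # Sum of the discounted rewards, each stream with its own factor.
        total += sum(beta ** t * r[action] for beta, r in zip(betas, rewards))
    return total


# Hypothetical data: stream 1 pays for "a" and is discounted heavily,
# stream 2 pays for "b" and is discounted lightly.
betas = (0.5, 0.9)
rewards = ({"a": 1.0, "b": 0.0}, {"a": 0.0, "b": 0.5})

always_a = weighted_value([], "a", rewards, betas)
always_b = weighted_value([], "b", rewards, betas)
switching = weighted_value(["a", "a"], "b", rewards, betas)

print(round(always_a, 6), round(always_b, 6), round(switching, 6))
```

In this assumed example the stationary strategies are worth about 2 and 5, while the switching strategy is worth about 5.55. Because there is only one state, a randomized stationary strategy that plays "a" with probability p is worth 2p + 5(1 - p), which never exceeds 5, so no stationary strategy comes within 0.55 of the switching strategy. The switching strategy is also stationary from time 2 onward, which is the structure the abstract describes.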

Keywords

Markov processes; decision theory; dynamic programming; state-space methods; discount factor; discrete-time Markov decision process; ε-optimal stationary strategy; state space; weighted discounted dynamic programming; Electric variables measurement; Extraterrestrial measurements; History; Infinite horizon; Measurement standards

fLanguage

English

Publisher

IEEE

Conference_Titel

Proceedings of the 30th IEEE Conference on Decision and Control, 1991

Conference_Location

Brighton

Print_ISBN

0-7803-0450-0

Type

conf

DOI

10.1109/CDC.1991.261350

Filename

261350

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=3470871