Title :
An asymptotically optimal learning controller for finite Markov chains with unknown transition probabilities
Author :
Sato, Mitsuo ; Abe, Kenichi ; Takeda, Hiroshi
Author_Institution :
Tohoku University, Sendai, Japan
Date :
11/1/1985
Abstract :
A learning controller is presented for a Markovian decision problem in which the transition probabilities are unknown. The controller is designed to be asymptotically optimal while accounting for the conflict between estimation and control: it determines its control policy using a performance criterion that explicitly incorporates a tradeoff between the two. It is shown that the controller achieves asymptotic optimality in the sense that the relative frequency of applying the optimal policy converges to unity.
Keywords :
Decision making; Learning control systems; Markov processes; Optimal stochastic control; Stochastic optimal control; Automatic control; Control systems; Degradation; Differential equations; Optimal control; Power system reliability; Power system stability; Riccati equations; Robust control; Stochastic processes
Journal_Title :
Automatic Control, IEEE Transactions on
DOI :
10.1109/TAC.1985.1103853