DocumentCode :
848070
Title :
An asymptotically optimal learning controller for finite Markov chains with unknown transition probabilities
Author :
Sato, Mitsuo ; Abe, Kenichi ; Takeda, Hiroshi
Author_Institution :
Tohoku University, Sendai, Japan
Volume :
30
Issue :
11
Year :
1985
Date :
11/1/1985
Firstpage :
1147
Lastpage :
1149
Abstract :
A learning controller is presented for a Markovian decision problem in which the transition probabilities are unknown. This controller, which is designed to be asymptotically optimal with consideration of a conflict between estimation and control, uses a performance criterion incorporating a tradeoff between them explicitly for determination of a control policy. It is shown that this controller achieves asymptotic optimality in the sense that the relative frequency of applying the optimal policy converges to unity.
Keywords :
Decision making; Learning control systems; Markov processes; Optimal stochastic control; Stochastic optimal control; Automatic control; Control systems; Degradation; Differential equations; Optimal control; Power system reliability; Power system stability; Riccati equations; Robust control; Stochastic processes
Language :
English
Journal_Title :
Automatic Control, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9286
Type :
jour
DOI :
10.1109/TAC.1985.1103853
Filename :
1103853