Title :
Learning control of finite Markov chains with periodically varying transition probabilities
Author :
Sato, Mitsuo ; Takeda, Hiroshi
Author_Institution :
Dept. of Electr. Eng., Tohoku Univ., Sendai, Japan
Abstract :
A learning scheme is presented for a Markovian decision problem with estimation of unknown transition probabilities that are governed by a periodically varying parameter with period T. Under this scheme, the unknown parameter is estimated every T time instants, and a policy sequence to be applied over the next T time instants is then determined. It is shown that the estimate converges to the true value as time evolves and, accordingly, that the scheme asymptotically attains control which is β-optimal in a broad sense.
Keywords :
Markov processes; adaptive control; learning systems; optimal control; stochastic systems; β-optimal control; Markovian decision problem; finite Markov chains; periodically varying transition probabilities; Control systems; Equations; Humans; Optimal control; Parameter estimation; Stochastic systems;
Conference_Title :
Decision and Control, 1990., Proceedings of the 29th IEEE Conference on
Conference_Location :
Honolulu, HI
DOI :
10.1109/CDC.1990.203852