DocumentCode
3313859
Title
Arbitrarily modulated Markov decision processes
Author
Yu, Jia Yuan ; Mannor, Shie
Author_Institution
Dept. of Electr. & Comput. Eng., McGill Univ., Montreal, QC, Canada
fYear
2009
fDate
15-18 Dec. 2009
Firstpage
2946
Lastpage
2953
Abstract
We consider decision-making problems in Markov decision processes where both the rewards and the transition probabilities vary in an arbitrary (e.g., nonstationary) fashion. We propose an online Q-learning style algorithm and give a guarantee on its performance evaluated in retrospect against alternative policies. Unlike previous works, the guarantee depends critically on the variability of the uncertainty in the transition probabilities, but holds regardless of arbitrary changes in rewards and transition probabilities over time. Besides its intrinsic computational efficiency, this approach requires neither prior knowledge nor estimation of the transition probabilities.
Keywords
Markov processes; decision theory; learning (artificial intelligence); performance evaluation; probability; arbitrarily modulated Markov decision processes; intrinsic computational efficiency; online Q-learning style algorithm; performance evaluation; transition probability; Computational efficiency; Control system synthesis; Control systems; Decision making; Game theory; Parameter estimation; Robustness; Sampling methods; Stochastic processes; Uncertainty;
fLanguage
English
Publisher
IEEE
Conference_Titel
Decision and Control, 2009 held jointly with the 2009 28th Chinese Control Conference. CDC/CCC 2009. Proceedings of the 48th IEEE Conference on
Conference_Location
Shanghai
ISSN
0191-2216
Print_ISBN
978-1-4244-3871-6
Electronic_ISBN
0191-2216
Type
conf
DOI
10.1109/CDC.2009.5400662
Filename
5400662