  • DocumentCode
    3313859
  • Title
    Arbitrarily modulated Markov decision processes
  • Author
    Yu, Jia Yuan ; Mannor, Shie
  • Author_Institution
    Dept. of Electr. & Comput. Eng., McGill Univ., Montreal, QC, Canada
  • fYear
    2009
  • fDate
    15-18 Dec. 2009
  • Firstpage
    2946
  • Lastpage
    2953
  • Abstract
    We consider decision-making problems in Markov decision processes where both the rewards and the transition probabilities vary in an arbitrary (e.g., nonstationary) fashion. We propose an online Q-learning style algorithm and give a guarantee on its performance evaluated in retrospect against alternative policies. Unlike previous works, the guarantee depends critically on the variability of the uncertainty in the transition probabilities, but holds regardless of arbitrary changes in rewards and transition probabilities over time. Besides its intrinsic computational efficiency, this approach requires neither prior knowledge nor estimation of the transition probabilities.
  • Keywords
    Markov processes; decision theory; learning (artificial intelligence); performance evaluation; probability; arbitrarily modulated Markov decision processes; intrinsic computational efficiency; online Q-learning style algorithm; transition probability; Computational efficiency; Control system synthesis; Control systems; Decision making; Game theory; Parameter estimation; Robustness; Sampling methods; Stochastic processes; Uncertainty
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Proceedings of the 48th IEEE Conference on Decision and Control, 2009, held jointly with the 2009 28th Chinese Control Conference (CDC/CCC 2009)
  • Conference_Location
    Shanghai
  • ISSN
    0191-2216
  • Print_ISBN
    978-1-4244-3871-6
  • Electronic_ISBN
    0191-2216
  • Type
    conf
  • DOI
    10.1109/CDC.2009.5400662
  • Filename
    5400662