• DocumentCode
    589275
  • Title
    Backtracking for More Efficient Large Scale Dynamic Programming
  • Author
    Tripp, C. ; Shachter, R.
  • Author_Institution
    Dept. of Electr. Eng., Stanford Univ., Stanford, CA, USA
  • Volume
    1
  • fYear
    2012
  • fDate
    12-15 Dec. 2012
  • Firstpage
    338
  • Lastpage
    343
  • Abstract
    Reinforcement learning algorithms are widely used to generate policies for complex Markov decision processes. We introduce backtracking, a modification to reinforcement learning algorithms that can significantly improve their performance, particularly for off-line policy generation. Backtracking waits to perform update calculations until the successor's value has been updated, allowing immediate reuse of update calculations. We demonstrate the effectiveness of backtracking on two benchmark processes using both Q-learning and real-time dynamic programming.
  • Keywords
    Markov processes; decision making; dynamic programming; learning (artificial intelligence); Q-learning; complex Markov decision processes; large scale dynamic programming; offline policy generation; real-time dynamic programming; reinforcement learning algorithms; Erbium; Heuristic algorithms; Learning; Mathematical model; Process control; Real-time systems; backtracking; experience replay; reinforcement learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications (ICMLA), 2012 11th International Conference on
  • Conference_Location
    Boca Raton, FL
  • Print_ISBN
    978-1-4673-4651-1
  • Type
    conf
  • DOI
    10.1109/ICMLA.2012.63
  • Filename
    6406685
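
The abstract above says that backtracking delays each update until the successor's value has been updated. A minimal sketch of one possible reading of that idea, applied to tabular Q-learning, is given here. It is not taken from the paper: the chain environment, the helper names (`q_update`, `run_episode`, `learn`), and the choice of `alpha = 1.0` and `gamma = 0.9` are all assumptions made for illustration. The sketch records one episode and then applies the same Q-learning updates either in visit order or in reverse order, so that in the reverse case each state's update can reuse the freshly updated value of its successor.

```python
from collections import defaultdict


def q_update(Q, s, a, r, s_next, actions, alpha=1.0, gamma=0.9, terminal=False):
    """Standard tabular Q-learning update for one transition."""
    best_next = 0.0 if terminal else max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def run_episode(n_states):
    """Hypothetical deterministic chain: states 0..n_states-1, one action
    'right', reward 1.0 on entering the last state, which is terminal."""
    transitions = []
    for s in range(n_states - 1):
        s_next = s + 1
        done = s_next == n_states - 1
        reward = 1.0 if done else 0.0
        transitions.append((s, "right", reward, s_next, done))
    return transitions


def learn(transitions, actions, backtrack):
    """Apply Q-learning updates to a recorded episode.

    backtrack=False: update in the order the transitions were visited.
    backtrack=True:  update in reverse order, so every successor has
                     already been updated when its predecessor is processed
                     (an assumed interpretation of the abstract).
    """
    Q = defaultdict(float)
    order = reversed(transitions) if backtrack else transitions
    for s, a, r, s_next, done in order:
        q_update(Q, s, a, r, s_next, actions, terminal=done)
    return Q


ACTIONS = ["right"]
episode = run_episode(5)
forward = learn(episode, ACTIONS, backtrack=False)
backward = learn(episode, ACTIONS, backtrack=True)

for state in range(4):
    print(state, forward[(state, "right")], backward[(state, "right")])
```

In this toy chain, a single pass in visit order only assigns a non-zero value to the state next to the goal, while the reverse-order pass propagates the discounted reward all the way back to the start state. Whether the paper's algorithm orders or triggers updates in exactly this way cannot be determined from the abstract alone.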