Title :
Exponential moving average Q-learning algorithm
Author :
Awheda, Mostafa D. ; Schwartz, Howard M.
Author_Institution :
Dept. of Syst. & Comput. Eng., Carleton Univ., Ottawa, ON, Canada
Abstract :
A multi-agent policy iteration learning algorithm is proposed in this work. The Exponential Moving Average (EMA) mechanism is used to update the policy for a Q-learning agent so that it converges to an optimal policy against the policies of the other agents. The proposed EMA Q-learning algorithm is examined on a variety of matrix and stochastic games. Simulation results show that the proposed algorithm converges in a wider variety of situations than state-of-the-art multi-agent reinforcement learning (MARL) algorithms.
Keywords :
iterative methods; learning (artificial intelligence); matrix algebra; moving average processes; multi-agent systems; stochastic games; EMA Q-learning algorithm; EMA mechanism; MARL algorithms; Q-learning agent; exponential moving average Q-learning algorithm; multiagent policy iteration learning algorithm; multiagent reinforcement learning algorithms; optimal policy; Games; Heuristic algorithms; Learning (artificial intelligence); Markov processes; Nash equilibrium; Probability distribution; Vectors;
Conference_Title :
Adaptive Dynamic Programming And Reinforcement Learning (ADPRL), 2013 IEEE Symposium on
Conference_Location :
Singapore
DOI :
10.1109/ADPRL.2013.6614986