Title :
Reinforcement learning in the game of Othello: Learning against a fixed opponent and learning from self-play
Author :
van der Ree, Michiel ; Wiering, Marco
Author_Institution :
Inst. of Artificial Intell. & Cognitive Eng., Univ. of Groningen, Groningen, Netherlands
Abstract :
This paper compares three strategies for using reinforcement learning algorithms to let an artificial agent learn to play the game of Othello. The three strategies are: learning by self-play, learning from playing against a fixed opponent, and learning from playing against a fixed opponent while also learning from the opponent's moves. These strategies are evaluated for the algorithms Q-learning, Sarsa and TD-learning. The three reinforcement learning algorithms are combined with multi-layer perceptrons and trained and tested against three fixed opponents. It is found that the best learning strategy differs per algorithm. Q-learning and Sarsa perform best when trained against the fixed opponent they are also tested against, whereas TD-learning performs best when trained through self-play. Surprisingly, Q-learning and Sarsa outperform TD-learning against the stronger fixed opponents when all methods use their best strategy. Learning from the opponent's moves as well leads to worse results than learning only from the learning agent's own moves.
Keywords :
computer games; game theory; learning (artificial intelligence); multi-agent systems; multilayer perceptrons; Othello game; Q-learning algorithm; Sarsa algorithm; TD-learning algorithm; artificial agent; reinforcement learning algorithms; self-play learning; Artificial neural networks; Games; Heuristic algorithms; Testing; Training
Conference_Title :
Adaptive Dynamic Programming And Reinforcement Learning (ADPRL), 2013 IEEE Symposium on
Conference_Location :
Singapore
DOI :
10.1109/ADPRL.2013.6614996