DocumentCode
3269575
Title
Reinforcement learning in the game of Othello: Learning against a fixed opponent and learning from self-play
Author
van der Ree, Michiel ; Wiering, Marco
Author_Institution
Inst. of Artificial Intell. & Cognitive Eng., Univ. of Groningen, Groningen, Netherlands
fYear
2013
fDate
16-19 April 2013
Firstpage
108
Lastpage
115
Abstract
This paper compares three strategies in using reinforcement learning algorithms to let an artificial agent learn to play the game of Othello. The three strategies that are compared are: learning by self-play, learning from playing against a fixed opponent, and learning from playing against a fixed opponent while learning from the opponent's moves as well. These issues are considered for the algorithms Q-learning, Sarsa and TD-learning. These three reinforcement learning algorithms are combined with multi-layer perceptrons and trained and tested against three fixed opponents. It is found that the best strategy of learning differs per algorithm. Q-learning and Sarsa perform best when trained against the fixed opponent they are also tested against, whereas TD-learning performs best when trained through self-play. Surprisingly, Q-learning and Sarsa outperform TD-learning against the stronger fixed opponents, when all methods use their best strategy. Learning from the opponent's moves as well leads to worse results compared to learning only from the learning agent's own moves.
Keywords
computer games; game theory; learning (artificial intelligence); multi-agent systems; multilayer perceptrons; Othello game; Q-learning algorithm; Sarsa algorithm; TD-learning algorithm; artificial agent; reinforcement learning algorithms; self-play learning; Artificial neural networks; Games; Heuristic algorithms; Learning (artificial intelligence); Testing; Training
fLanguage
English
Publisher
IEEE
Conference_Titel
Adaptive Dynamic Programming And Reinforcement Learning (ADPRL), 2013 IEEE Symposium on
Conference_Location
Singapore
ISSN
2325-1824
Type
conf
DOI
10.1109/ADPRL.2013.6614996
Filename
6614996