DocumentCode
1010
Title
Two Online Learning Playout Policies in Monte Carlo Go: An Application of Win/Loss States
Author
Basaldua, Jacques ; Stewart, Steven ; Moreno-Vega, J. Marcos ; Drake, Peter D.
Author_Institution
Dept. de Estadistica IO y Comput., Univ. de La Laguna, La Laguna, Spain
Volume
6
Issue
1
fYear
2014
fDate
March 2014
Firstpage
46
Lastpage
54
Abstract
Recently, Monte Carlo tree search (MCTS) has become the dominant algorithm in computer Go. This paper compares two simulation algorithms known as playout policies. The base policy includes some mandatory domain-specific knowledge such as seki and urgency patterns, but is still simple to implement. The more advanced learning policy combines two different learning algorithms with those implemented in the base policy. This policy makes use of win/loss states (WLSs) to learn win rates for large sets of features. A very large experimental series of 7960 games includes results for different board sizes, in self-play and against a reference opponent: Fuego. Results are given for equal numbers of simulations and equal central processing unit (CPU) allocation. The improvement is around 100 Elo points, even with equal CPU allocation, and it increases with the number of simulations. Analyzing the proportion of moves generated by each part of the policy and the individual impact of each part provides further insight into how the policy is learning.
Keywords
Monte Carlo methods; computer games; learning (artificial intelligence); tree searching; CPU; Elo points; FUEGO; MCTS; Monte Carlo Go; Monte Carlo tree search; central processing unit allocation; computer Go; learning algorithms; learning policy; mandatory domain-specific knowledge; online learning playout policies; seki patterns; urgency patterns; win-loss states; Computational modeling; Context; Games; Resource management; Shape; Tracking; Knowledge discovery; statistical learning; stochastic systems
fLanguage
English
Journal_Title
Computational Intelligence and AI in Games, IEEE Transactions on
Publisher
IEEE
ISSN
1943-068X
Type
jour
DOI
10.1109/TCIAIG.2013.2292565
Filename
6675777