• DocumentCode
    1010
  • Title
    Two Online Learning Playout Policies in Monte Carlo Go: An Application of Win/Loss States
  • Author
    Basaldua, Jacques ; Stewart, Steven ; Moreno-Vega, J. Marcos ; Drake, Peter D.
  • Author_Institution
    Dept. de Estadistica IO y Comput., Univ. de La Laguna, La Laguna, Spain
  • Volume
    6
  • Issue
    1
  • fYear
    2014
  • fDate
    March 2014
  • Firstpage
    46
  • Lastpage
    54
  • Abstract
    Recently, Monte Carlo tree search (MCTS) has become the dominant algorithm in Computer Go. This paper compares two simulation algorithms known as playout policies. The base policy includes some mandatory domain-specific knowledge such as seki and urgency patterns, but is still simple to implement. The more advanced learning policy combines two different learning algorithms with those implemented in the base policy. This policy makes use of win/loss states (WLSs) to learn win rates for large sets of features. A very large experimental series of 7960 games includes results for different board sizes, in self-play and against a reference opponent: Fuego. Results are given for equal numbers of simulations and equal central processing unit (CPU) allocation. The improvement is around 100 Elo points, even with equal CPU allocation, and it increases with the number of simulations. Analyzing the proportion of moves generated by each part of the policy and the individual impact of each part provides further insight into how the policy is learning.
  • Keywords
    Monte Carlo methods; computer games; learning (artificial intelligence); tree searching; CPU; Elo points; FUEGO; MCTS; Monte Carlo Go; Monte Carlo tree search; central processing unit allocation; computer Go; learning algorithms; learning policy; mandatory domain-specific knowledge; online learning playout policies; seki patterns; urgency patterns; win-loss states; Computational modeling; Context; Games; Resource management; Shape; Tracking; Knowledge discovery; statistical learning; stochastic systems
  • fLanguage
    English
  • Journal_Title
    Computational Intelligence and AI in Games, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1943-068X
  • Type
    jour
  • DOI
    10.1109/TCIAIG.2013.2292565
  • Filename
    6675777