DocumentCode :
568067
Title :
Applying temporal difference learning to acquire a high-performance position evaluation function
Author :
Yin, Hong-Feng ; Fu, Ting-Ting
Author_Institution :
Dept. of Comput. Sci., Beijing JiaoTong Univ. Haibin Coll., Beijing, China
fYear :
2012
fDate :
14-17 July 2012
Firstpage :
80
Lastpage :
84
Abstract :
Chinese-Chess is a more complex board game than International-Chess; it is usually played on a square grid containing 10 × 9 intersections. In the Chinese-Chess Computer Game (CCCG), the most time-consuming aspect of building a high-performance game-playing program is the design, implementation, and tuning of the position evaluation function. In this paper, a three-layer fully connected feed-forward neural network is designed as a position evaluation function. Temporal difference learning (TDL) is a reinforcement learning algorithm that uses the difference between a pair of successive position values to incrementally update the weights. Based on the three-layer neural network with a single output, we derive a new weight-updating rule for applying TD(λ) in CCCG. Starting with random initial weights between -0.5 and 0.5, the neural network is trained with the new rule on games from a grand-master database. In the training process, each grand-master game is learned iteratively by the neural network until the evaluation values of the positions in that game become stable. The experiments validate that the learned evaluation function is feasible and effective.
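The mechanism summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's new weight-updating rule, which the abstract does not spell out; it uses the standard online TD(λ) update with accumulating eligibility traces instead. The class and function names, the tanh activations, the feature encoding, the learning rate `alpha`, the trace decay `lam`, and the stopping tolerance `tol` are all assumptions made for illustration. Only the three-layer shape, the single output, the [-0.5, 0.5] initialization, and the repeat-until-stable training loop come from the abstract.

```python
import math
import random


class TinyValueNet:
    """Three-layer net (input, hidden, single output); tanh units are an assumption."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Random initial weights in [-0.5, 0.5], as stated in the abstract.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]

    def forward(self, x):
        """Return (position value, hidden activations) for feature vector x."""
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in self.w1]
        v = math.tanh(sum(w * hj for w, hj in zip(self.w2, h)))
        return v, h

    def gradients(self, x):
        """Return (value, dV/dw1, dV/dw2) by backpropagation through tanh."""
        v, h = self.forward(x)
        dv = 1.0 - v * v
        g2 = [dv * hj for hj in h]
        g1 = [[dv * w2j * (1.0 - hj * hj) * xi for xi in x]
              for w2j, hj in zip(self.w2, h)]
        return v, g1, g2


def td_lambda_episode(net, positions, outcome, alpha=0.05, lam=0.7):
    """One online TD(lambda) pass over a game (hypothetical helper).

    positions: list of feature vectors, one per successive position.
    outcome:   final game result in [-1, 1], used as the last target.
    """
    n_h, n_in = len(net.w2), len(net.w1[0])
    e1 = [[0.0] * n_in for _ in range(n_h)]  # traces for input-to-hidden weights
    e2 = [0.0] * n_h                         # traces for hidden-to-output weights
    for t, x in enumerate(positions):
        v, g1, g2 = net.gradients(x)
        # Decay the old traces and add the current gradient.
        for j in range(n_h):
            e2[j] = lam * e2[j] + g2[j]
            for i in range(n_in):
                e1[j][i] = lam * e1[j][i] + g1[j][i]
        # Temporal difference: next position's value (or the outcome) minus this one.
        if t + 1 < len(positions):
            v_next, _ = net.forward(positions[t + 1])
        else:
            v_next = outcome
        delta = v_next - v
        for j in range(n_h):
            net.w2[j] += alpha * delta * e2[j]
            for i in range(n_in):
                net.w1[j][i] += alpha * delta * e1[j][i]


def train_until_stable(net, positions, outcome, tol=1e-3, max_passes=200):
    """Replay one game until its position values stop changing by more than tol."""
    prev = [net.forward(x)[0] for x in positions]
    for n in range(1, max_passes + 1):
        td_lambda_episode(net, positions, outcome)
        cur = [net.forward(x)[0] for x in positions]
        if max(abs(a - b) for a, b in zip(cur, prev)) < tol:
            return n
        prev = cur
    return max_passes
```

In this sketch a game is a list of feature vectors plus a final result; a call such as `train_until_stable(net, positions, 1.0)` replays the game and returns the number of passes used. How the paper encodes the 10 × 9 board as network inputs is not stated in the abstract, so the input vectors here are placeholders.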
Keywords :
computer games; feedforward neural nets; learning (artificial intelligence); CCCG; Chinese-chess computer game; complex board game; feed forward neural network; grand-master database games; high performance game playing program; high-performance position evaluation function; international-chess; random initial weights; reinforcement learning algorithm; square grid; temporal difference learning; Computers; Databases; Educational institutions; Games; Law; Neural networks; Chinese-chess computer game; evaluation function; grand-master game database; neural network; temporal difference learning;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Science & Education (ICCSE), 2012 7th International Conference on
Conference_Location :
Melbourne, VIC
Print_ISBN :
978-1-4673-0241-8
Type :
conf
DOI :
10.1109/ICCSE.2012.6295031
Filename :
6295031