Regional Information Center for Science and Technology (RICEST) - Applying temporal difference learning to acquire a high-performance position evaluation function

DocumentCode :

568067

Title :

Applying temporal difference learning to acquire a high-performance position evaluation function

Author :

Yin, Hong-Feng ; Fu, Ting-Ting

Author_Institution :

Dept. of Comput. Sci., Beijing JiaoTong Univ. Haibin Coll., Beijing, China

fYear :

2012

fDate :

14-17 July 2012

Firstpage :

Lastpage :

Abstract :

Chinese chess is a more complex board game than international chess; it is usually played on a square grid containing 10 × 9 intersections. In the Chinese-Chess Computer Game (CCCG), the most time-consuming part of building a high-performance game-playing program is the design, implementation, and tuning of the position evaluation function. In this paper, a three-layer fully connected feed-forward neural network is designed as the position evaluation function. Temporal difference learning (TDL) is a reinforcement learning algorithm that uses the difference between the values of two successive positions to update the weights incrementally. For this three-layer neural network with a single output, we derive a new weight-updating rule for applying TD(λ) in CCCG. Starting from random initial weights between -0.5 and 0.5, the neural network is trained with the new rule on a database of grandmaster games. During training, each grandmaster game is learned iteratively until the network's evaluation values for the positions in that game become stable. Experiments show that the learned evaluation function is feasible and effective.
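
The training loop described in the abstract can be sketched roughly as follows. The abstract does not state the paper's derived weight-updating rule, so this sketch substitutes textbook TD(λ) with eligibility traces. The class and function names, the sigmoid activations, the feature-vector encoding of a position, the outcome scale in [0, 1], and the values of alpha, lam, and tol are all assumptions made for illustration. Only the three-layer single-output shape, the initial weight range of -0.5 to 0.5, and the repeat-until-stable idea come from the abstract.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class EvalNet:
    """Three-layer (input, hidden, single output) feed-forward network."""

    def __init__(self, n_in, n_hidden, rng=None):
        rng = rng or random.Random()
        # Random initial weights in [-0.5, 0.5], as stated in the abstract.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)))
                  for row in self.w1]
        value = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)))
        return value, hidden

    def gradient(self, x):
        """Return (V, dV/dw1, dV/dw2) for the single output V."""
        value, hidden = self.forward(x)
        d_out = value * (1.0 - value)  # derivative of the output sigmoid
        g2 = [d_out * h for h in hidden]
        g1 = [[d_out * w2j * h * (1.0 - h) * xi for xi in x]
              for w2j, h in zip(self.w2, hidden)]
        return value, g1, g2


def td_lambda_game(net, positions, outcome, alpha=0.1, lam=0.7):
    """One pass of textbook TD(lambda) over the positions of one game.

    positions: list of feature vectors (hypothetical encoding).
    outcome: final result in [0, 1], the target after the last position.
    """
    e1 = [[0.0] * len(row) for row in net.w1]  # eligibility traces for w1
    e2 = [0.0] * len(net.w2)                   # eligibility traces for w2
    for t, x in enumerate(positions):
        value, g1, g2 = net.gradient(x)
        # Decay the traces by lam and add the current gradient.
        for j in range(len(e2)):
            e2[j] = lam * e2[j] + g2[j]
            for i in range(len(e1[j])):
                e1[j][i] = lam * e1[j][i] + g1[j][i]
        # Temporal difference between successive position values.
        if t + 1 < len(positions):
            target = net.forward(positions[t + 1])[0]
        else:
            target = outcome
        delta = target - value
        for j in range(len(e2)):
            net.w2[j] += alpha * delta * e2[j]
            for i in range(len(e1[j])):
                net.w1[j][i] += alpha * delta * e1[j][i]


def train_until_stable(net, positions, outcome, tol=1e-4, max_passes=1000):
    """Repeat passes over one game until its position values stop changing."""
    prev = [net.forward(x)[0] for x in positions]
    for n in range(1, max_passes + 1):
        td_lambda_game(net, positions, outcome)
        cur = [net.forward(x)[0] for x in positions]
        if max(abs(a - b) for a, b in zip(cur, prev)) < tol:
            return n
        prev = cur
    return max_passes
```

Calling train_until_stable(EvalNet(4, 3), game, 1.0) on a list of 4-element feature vectors returns the number of passes it ran. The stopping test used here (largest change in any position value below tol) is one plausible reading of "becomes stable"; the paper may use a different criterion.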

Keywords :

computer games; feedforward neural nets; learning (artificial intelligence); CCCG; Chinese-chess computer game; complex board game; feed forward neural network; grand-master database games; high performance game playing program; high-performance position evaluation function; international-chess; random initial weights; reinforcement learning algorithm; square grid; temporal difference learning; Computers; Databases; Educational institutions; Games; Law; Neural networks; evaluation function; grand-master game database; neural network

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Computer Science & Education (ICCSE), 2012 7th International Conference on

Conference_Location :

Melbourne, VIC

Print_ISBN :

978-1-4673-0241-8

Type :

conf

DOI :

10.1109/ICCSE.2012.6295031

Filename :

6295031

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=568067