DocumentCode :
2864642
Title :
An Optimal Strategy Learning for RoboCup in Continuous State Space
Author :
Junyuan, Tao ; Desheng, Li
Author_Institution :
Dept. of Autom. Measurement & Control, Harbin Inst. of Technol.
fYear :
2006
fDate :
25-28 June 2006
Firstpage :
301
Lastpage :
305
Abstract :
RoboCup offers a set of challenges for machine learning researchers because it is a dynamic, nondeterministic, delayed-goal, continuous state space problem. Reinforcement learning (RL), a method for learning an optimal control policy for sequential decision-making problems, is often used for strategy learning in RoboCup. However, RL is difficult to apply to continuous state space problems because the number of states grows exponentially with the number of state variables. An effective remedy is to combine RL with function approximation, but this combination sometimes diverges. In this paper, we analyze the main reason for the non-convergence of current approximate RL algorithms and propose an optimal strategy learning method. The two processes in RL, value evaluation and policy improvement, are separated. The policy search process is strictly constrained to the direction of improving performance, according to the evaluation provided by the value function. We apply this algorithm successfully to Keepaway, a standard RoboCup sub-problem. Experimental results verify the effectiveness of the method and show that the algorithm can converge to a locally optimal policy.
Keywords :
continuous time systems; decision making; function approximation; learning (artificial intelligence); mobile robots; multi-robot systems; optimal control; state-space methods; RoboCup; continuous state space; machine learning; optimal strategy learning; reinforcement learning; sequential decision-making; Algorithm design and analysis; Approximation algorithms; Delay; Learning systems; Process control; optimal control policy
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Mechatronics and Automation, Proceedings of the 2006 IEEE International Conference on
Conference_Location :
Luoyang, Henan
Print_ISBN :
1-4244-0465-7
Electronic_ISBN :
1-4244-0466-5
Type :
conf
DOI :
10.1109/ICMA.2006.257531
Filename :
4026098