Title :
Research of a heuristic reward function for reinforcement learning algorithms
Author :
Wei, Yingzi ; Zhao, Mingyang ; Zhang, Feng ; Hu, Yulan
Author_Institution :
Shenyang Inst. of Autom., Chinese Acad. of Sci., Shenyang, China
Abstract :
The reward function is a critical component of reinforcement learning (RL): it evaluates actions and guides the learning process. According to how rewards are distributed over the state space, reward functions take two basic forms, dense and sparse. This paper presents an approach to designing a heuristic reward function, in which an additional reward is added to the traditional sparse reward function. The additional reward function F is a difference of potentials, which provides more heuristic information so that the learning system can progress rapidly. We also prove the convergence property of Q-value iteration. The heuristic reward function helps to implement an efficient reinforcement learning system for real-time control or scheduling.
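The idea in the abstract, a sparse reward plus an additional term F built from a difference of potentials, can be illustrated with a minimal sketch. The abstract does not give the exact definition of F, so this sketch assumes the common discounted form F(s, s') = gamma * phi(s') - phi(s). The chain environment, the goal state, the discount factor, and the distance-based potential are all hypothetical choices made for illustration, not taken from the paper.

```python
GOAL = 5     # hypothetical goal state on a 1-D chain of states 0..5
GAMMA = 0.9  # hypothetical discount factor


def sparse_reward(next_state):
    """Traditional sparse reward: nonzero only when the goal is reached."""
    return 1.0 if next_state == GOAL else 0.0


def potential(state):
    """Hypothetical potential: negative distance to the goal."""
    return -float(abs(GOAL - state))


def shaping_term(state, next_state):
    """Additional reward F, assumed here to be a discounted difference of potentials."""
    return GAMMA * potential(next_state) - potential(state)


def heuristic_reward(state, next_state):
    """Sparse reward plus the additional potential-based term F."""
    return sparse_reward(next_state) + shaping_term(state, next_state)


if __name__ == "__main__":
    # Compare a step toward the goal with a step away from it.
    print("toward goal:", heuristic_reward(2, 3))
    print("away from goal:", heuristic_reward(2, 1))
```

With this assumed potential, a step toward the goal earns a larger reward than a step away from it even when the sparse reward is zero for both, which is the extra heuristic information the abstract refers to.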
Keywords :
Markov processes; convergence of numerical methods; decision theory; iterative methods; learning (artificial intelligence); learning systems; optimisation; Q-value iteration; additional reward function; convergence property; dense functions; heuristic reward function; real time control; real time scheduling system; reinforcement learning algorithms; sparse reward function; Control systems; Convergence; Learning systems; Real time systems; Space technology
Conference_Title :
Intelligent Control and Automation, 2004. WCICA 2004. Fifth World Congress on
Print_ISBN :
0-7803-8273-0
DOI :
10.1109/WCICA.2004.1342083