The Q(λ) algorithm based on heuristic reward function

Author

Zhang, Jianhong ; Shi, Ying ; Xie, Xiaofei

Author_Institution

Sch. of Inf. & Eng., Huzhou Teachers' Coll., Huzhou, China

fYear

2010

fDate

13-15 Aug. 2010

Firstpage

139

Lastpage

142

Abstract

Reinforcement learning often converges slowly on continuous and complex tasks. To address this problem, this paper proposes Q(λ)-HRF, a Q(λ) algorithm based on a heuristic reward function. The algorithm extracts features from the environment to obtain heuristic information, which is supplied to the agent's learning in the form of a reward function and significantly accelerates convergence. We also prove the convergence of the algorithm mathematically and apply it to the Maze platform. The experimental results show that the Q(λ)-HRF algorithm converges faster than the Q(λ) algorithm.
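
To illustrate the idea summarized in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes a tabular Watkins-style Q(λ) learner on a small grid maze, and it assumes the heuristic reward is a potential-based shaping term built from the Manhattan distance to the goal; this record does not state the paper's actual feature extraction or reward form. All names (heuristic_reward, q_lambda_maze, the grid size, the hyperparameters) are hypothetical.

```python
import random
from collections import defaultdict

# Moves as (row delta, column delta): up, down, left, right.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def heuristic_reward(state, next_state, goal, gamma):
    """Assumed heuristic: shaping term gamma*phi(s') - phi(s),
    where phi(s) is the negative Manhattan distance to the goal."""
    def phi(s):
        return -(abs(s[0] - goal[0]) + abs(s[1] - goal[1]))
    return gamma * phi(next_state) - phi(state)


def step(state, action, size, walls):
    """Move inside a size x size grid; blocked moves leave the state unchanged."""
    r, c = state[0] + action[0], state[1] + action[1]
    if 0 <= r < size and 0 <= c < size and (r, c) not in walls:
        return (r, c)
    return state


def q_lambda_maze(size=5, start=(0, 0), goal=(4, 4), walls=frozenset(),
                  episodes=200, max_steps=500, alpha=0.1, gamma=0.95,
                  lam=0.8, epsilon=0.1, use_heuristic=True, seed=0):
    """Tabular Watkins-style Q(lambda); returns the Q-table and steps per episode."""
    rng = random.Random(seed)
    q = defaultdict(float)

    def best_action(s):
        values = [q[(s, a)] for a in range(len(ACTIONS))]
        top = max(values)
        return rng.choice([a for a, v in enumerate(values) if v == top])

    def choose(s):
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            return rng.randrange(len(ACTIONS))
        return best_action(s)

    steps_log = []
    for _ in range(episodes):
        s, trace, steps = start, defaultdict(float), 0
        a = choose(s)
        while s != goal and steps < max_steps:
            s2 = step(s, ACTIONS[a], size, walls)
            r = 1.0 if s2 == goal else 0.0
            if use_heuristic:
                r += heuristic_reward(s, s2, goal, gamma)
            a2 = choose(s2)
            a_star = best_action(s2)
            if q[(s2, a2)] == q[(s2, a_star)]:
                a_star = a2  # the chosen action counts as greedy on ties
            target = r if s2 == goal else r + gamma * q[(s2, a_star)]
            delta = target - q[(s, a)]
            trace[(s, a)] += 1.0  # accumulating eligibility trace
            for key in list(trace):
                q[key] += alpha * delta * trace[key]
                # Decay traces after a greedy action, cut them after exploration.
                trace[key] = gamma * lam * trace[key] if a2 == a_star else 0.0
            s, a = s2, a2
            steps += 1
        steps_log.append(steps)
    return q, steps_log
```

Calling q_lambda_maze(use_heuristic=True) and q_lambda_maze(use_heuristic=False) with the same seed and comparing the two steps_log lists gives a rough way to compare the two settings on this toy maze. It says nothing about the results reported in the paper.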

Keywords

learning (artificial intelligence); HRF algorithm; Maze platform; Q(λ) algorithm; complex task; continuous task; convergence speed problem; feature extraction; heuristic reward function; reinforcement learning; Algorithm design and analysis; Convergence; Feature extraction; Heuristic algorithms; Learning; Machine learning algorithms; Markov processes

fLanguage

English

Publisher

IEEE

Conference_Title

Intelligent Control and Information Processing (ICICIP), 2010 International Conference on

Conference_Location

Dalian

Print_ISBN

978-1-4244-7047-1

Type

conf

DOI

10.1109/ICICIP.2010.5564220

Filename

5564220

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=1942944