Adaptive model learning method for reinforcement learning

Author

Hwang, Kao-Shing ; Jiang, Wei-Cheng ; Chen, Yu-Jen

Author_Institution

Dept. of Electr. Eng., Univ. of Sun Yat-Sen, Kaohsiung, Taiwan

fYear

2012

fDate

20-23 Aug. 2012

Firstpage

1277

Lastpage

1280

Abstract

The original Q-learning method struggles to achieve sample efficiency, for example when a policy must be trained to reach a goal within a limited number of time steps. The Dyna-Q agent was therefore proposed to speed up policy learning. However, Dyna-Q does not specify how to build the model, so a lookup table is commonly used as the model. In this paper, we propose an adaptive model learning method based on tree structures and combine it with Q-learning to form a Tree-Based Dyna-Q agent that enhances policy learning. Once the tree-based model has learned an accurate model, a planning method can use it to produce simulated experiences that accelerate value iteration. The agent with the proposed method can thus obtain virtual experiences for updating the policy. Simulation results show that the proposed method noticeably reduces training time.
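
The Dyna-Q loop described in the abstract can be illustrated with a minimal sketch. It uses the plain table model that the abstract names as the common default, not the paper's tree-based model, whose details the abstract does not give. The corridor environment, the function name `dyna_q`, and all hyperparameter values are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def dyna_q(n_states=6, episodes=30, planning_steps=10,
           alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Dyna-Q on a hypothetical 1-D corridor (goal = last state)."""
    rng = random.Random(seed)
    actions = (-1, +1)            # move left / move right
    goal = n_states - 1
    q = defaultdict(float)        # Q[(state, action)], initialised to 0
    model = {}                    # table model: (s, a) -> (reward, next_state)

    def step(s, a):
        # Assumed deterministic environment: reward 1 only on reaching the goal.
        s2 = min(max(s + a, 0), goal)
        return (1.0 if s2 == goal else 0.0), s2

    def greedy(s):
        best = max(q[(s, a)] for a in actions)
        return rng.choice([a for a in actions if q[(s, a)] == best])

    def update(s, a, r, s2):
        # One-step Q-learning backup, used for real and simulated experience.
        target = r + gamma * max(q[(s2, b)] for b in actions)
        q[(s, a)] += alpha * (target - q[(s, a)])

    steps_per_episode = []
    for _ in range(episodes):
        s, steps = 0, 0
        while s != goal:
            a = rng.choice(actions) if rng.random() < epsilon else greedy(s)
            r, s2 = step(s, a)
            update(s, a, r, s2)              # direct learning from real experience
            model[(s, a)] = (r, s2)          # model learning (table entry)
            for _ in range(planning_steps):  # planning with simulated experience
                ps, pa = rng.choice(list(model))
                pr, ps2 = model[(ps, pa)]
                update(ps, pa, pr, ps2)
            s, steps = s2, steps + 1
        steps_per_episode.append(steps)
    return q, steps_per_episode


if __name__ == "__main__":
    q_values, steps = dyna_q()
    print("steps per episode:", steps)
```

Setting `planning_steps=0` reduces the sketch to ordinary Q-learning, which gives a simple baseline for comparing how many real steps each variant needs per episode.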

Keywords

iterative methods; learning (artificial intelligence); trees (mathematics); adaptive model learning method; original Q-learning method; planning method; policy learning; reinforcement learning; sample efficiency; tree-based Dyna-Q agent; tree-based model; value iterations; Adaptation models; Educational institutions; Learning; Learning systems; Planning; Silicon; Training; Dyna-Q agent; Reinforcement learning; adaptive model learning method;

fLanguage

English

Publisher

IEEE

Conference_Titel

SICE Annual Conference (SICE), 2012 Proceedings of

Conference_Location

Akita

ISSN

pending

Print_ISBN

978-1-4673-2259-1

Type

conf

Filename

6318643