Title :
Tree-based Dyna-Q agent
Author :
Hwang, Kao-Shing ; Jiang, Wei-Cheng ; Chen, Yu-Jen
Abstract :
This article presents a Dyna-Q learning method based on a tree-structured world model to improve sampling efficiency in reinforcement learning problems. A Q-learning mechanism handles policy learning while the tree learns the world model by observing the state transitions that follow the actions taken. In the early stages of learning, the agent does not yet have an accurate model, so it explores the environment as much as possible to collect sufficient experience to approximate the environment model. Once the agent has developed a more accurate model, a planning method can use it to generate simulated experiences that accelerate value iteration; the agent with the proposed method thus obtains virtual experiences for updating its policy. Simulations of a mobile robot escaping from a labyrinth verify the performance of a robot equipped with the proposed method. The results show that the tree-based Dyna-Q agent speeds up the learning process.
Keywords :
iterative methods; learning (artificial intelligence); mobile robots; multi-agent systems; multi-robot systems; path planning; sampling methods; tree data structures; Dyna-Q learning method; environment model; labyrinth; mobile robot; planning method; policy learning; reinforcement learning; sampling data; simulated experiences; tree structures; tree-based Dyna-Q agent; value iterations acceleration; Algorithm design and analysis; Educational institutions; Electrical engineering; Learning; Mobile robots; Planning; Training;
Conference_Titel :
2012 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM)
Conference_Location :
Kaohsiung
Print_ISBN :
978-1-4673-2575-2
DOI :
10.1109/AIM.2012.6266001
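The learning loop the abstract describes — direct Q-learning from real transitions, a world model learned from those same transitions, and planning updates replayed from the model — follows the standard Dyna-Q scheme. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the tree-structured world model is replaced by a plain deterministic lookup table, and `env_step`, the state encoding, and all hyperparameters are hypothetical.

```python
import random
from collections import defaultdict

def dyna_q(env_step, actions, episodes=50, n_planning=10,
           alpha=0.1, gamma=0.95, epsilon=0.1, start=0):
    """Tabular Dyna-Q: Q-learning on real experience plus planning on a model.

    env_step(state, action) -> (next_state, reward, done) is an assumed
    environment interface; the model is a deterministic lookup table
    standing in for the paper's tree-structured world model.
    """
    Q = defaultdict(float)   # Q[(state, action)]
    model = {}               # model[(state, action)] = (next_state, reward)
    for _ in range(episodes):
        s, done = start, False
        while not done:
            # epsilon-greedy action selection for exploration
            if random.random() < epsilon:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda x: Q[(s, x)])
            s2, r, done = env_step(s, a)
            # direct RL update from the real transition
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, x)] for x in actions)
                                  - Q[(s, a)])
            # learn the world model from the observed transition
            model[(s, a)] = (s2, r)
            # planning: replay simulated experiences drawn from the model
            for _ in range(n_planning):
                (ps, pa), (ps2, pr) = random.choice(list(model.items()))
                Q[(ps, pa)] += alpha * (pr + gamma * max(Q[(ps2, x)] for x in actions)
                                        - Q[(ps, pa)])
            s = s2
    return Q
```

With `n_planning = 0` this reduces to plain Q-learning; increasing it trades computation for fewer real environment interactions, which is the sampling-efficiency gain the abstract claims.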