Regional Information Center for Science and Technology - A heuristic Q-learning architecture for fully exploring a world and deriving an optimal policy by model-based planning

DocumentCode :

339239

Title :

A heuristic Q-learning architecture for fully exploring a world and deriving an optimal policy by model-based planning

Author :

Zhao, Gang ; Tatsumi, Shoji ; Sun, Ruoying

Author_Institution :

Fac. of Eng., Osaka City Univ., Japan

Volume :

fYear :

1999

fDate :

1999

Firstpage :

2078

Abstract :

For solving Markov decision processes with incomplete information in robot learning tasks, model-based algorithms make effective use of gathered data but usually require extensive computation. Dyna-Q is an architecture that uses experience to build a model and simultaneously uses that model to adjust the policy; however, it does not help an agent explore an environment actively. In this paper, we present the Exa-Q architecture, which learns models and plans with the learned models to help the reinforcement learning agent explore an environment actively and improve the reinforcement function estimate. As a result, the Exa-Q architecture can identify an environment fully and speed up learning of the optimal policy. Experimental results demonstrate that the proposed method is efficient.
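
The Dyna-Q baseline mentioned in the abstract can be illustrated with a minimal sketch. This is the generic tabular Dyna-Q loop as it is commonly described (direct Q-learning update, model learning, then planning from the learned model), not the Exa-Q architecture, whose details the abstract does not give. The corridor world, the function name `dyna_q`, and all parameter values are hypothetical choices made only for this illustration.

```python
import random

def dyna_q(n_states=6, episodes=30, planning_steps=10,
           alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Dyna-Q on a hypothetical 1-D corridor (illustration only).

    States are 0..n_states-1; actions are 0 (left) and 1 (right).
    Entering the last state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    actions = (0, 1)
    q = {(s, a): 0.0 for s in range(n_states) for a in actions}
    model = {}  # learned deterministic model: (s, a) -> (reward, next_state)

    def step(s, a):
        # Assumed environment dynamics: move one cell, walls clamp the move.
        s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
        return (1.0 if s2 == n_states - 1 else 0.0), s2

    def greedy(s):
        # Greedy action with random tie-breaking.
        best = max(q[(s, a)] for a in actions)
        return rng.choice([a for a in actions if q[(s, a)] == best])

    def update(s, a, r, s2):
        # One-step Q-learning backup.
        target = r + gamma * max(q[(s2, b)] for b in actions)
        q[(s, a)] += alpha * (target - q[(s, a)])

    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            a = rng.choice(actions) if rng.random() < epsilon else greedy(s)
            r, s2 = step(s, a)
            update(s, a, r, s2)        # learn from real experience
            model[(s, a)] = (r, s2)    # record the transition in the model
            for _ in range(planning_steps):
                # Planning: replay transitions sampled from the learned model.
                ps, pa = rng.choice(list(model))
                pr, ps2 = model[(ps, pa)]
                update(ps, pa, pr, ps2)
            s = s2
    return q, model
```

Calling `q, model = dyna_q()` returns the learned action values and the learned model. Note that the planning loop only replays state-action pairs the agent has already tried, which is consistent with the abstract's remark that Dyna-Q does not by itself drive active exploration.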

Keywords :

Markov processes; heuristic programming; learning (artificial intelligence); optimisation; planning (artificial intelligence); robots; Dyna-Q; Exa-Q architecture; Markov decision processes; heuristic Q-learning architecture; model-based planning; optimal policy; reinforcement learning agent; Business; Computer architecture; Data engineering; Educational institutions; Educational robots; Engineering management; Learning; Orbital robotics; Sun; Training data;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Robotics and Automation, 1999. Proceedings. 1999 IEEE International Conference on

Conference_Location :

Detroit, MI

ISSN :

1050-4729

Print_ISBN :

0-7803-5180-0

Type :

conf

DOI :

10.1109/ROBOT.1999.770413

Filename :

770413

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=339239