Regional Information Center for Science and Technology - Q-learning with generalisation: an architecture for real-world reinforcement learning in a mobile robot

DocumentCode :

3565740

Title :

Q-learning with generalisation: an architecture for real-world reinforcement learning in a mobile robot

Author :

Holland, Owen ; Snaith, Martin

Author_Institution :

Artificial Life Technol., Rodborough, UK

Year :

1992

Firstpage :

287

Abstract :

It is noted that the time constraints imposed by using real robots rather than simulations are so severe that only architectures giving learning which is efficient in terms of elapsed time and number of trials can be used. Arguments are presented to support the view that multilayer perceptrons are inappropriate because of the extent to which new learning interferes with old learning. The structure of C.J.C.H. Watkins's Q-learning (1989), a discrete-state and discrete-time reinforcement learning scheme closely related to dynamic programming and capable of a connectionist interpretation, is shown to be suitable, and refinements are proposed to permit generalization and to further protect information. A simple representation of the unlearned components of internal states (perception-action sequence, or PAS, encoding) in terms of the recent history of perceptions and actions is proposed for use in navigating between landmarks in environments where landmarks are rare. A recently developed behavior-based mobile robot (FRANK) is described which has a neurally based perception mechanism known to operate reliably in an unstructured human environment and an onboard computer to implement the modified Q algorithm and PAS encoding.
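
The abstract names two ingredients: Watkins's discrete-state, discrete-time Q-learning, and states described by the recent history of perceptions and actions. The sketch below shows how the two could fit together in a plain lookup table. It is an illustration only, not the authors' implementation: the class name `PASQLearner`, the fixed history length, the parameter values, and the perception and action labels are all assumptions, and the generalisation and information-protection refinements that the paper proposes are not reproduced here.

```python
from collections import defaultdict, deque


class PASQLearner:
    """Tabular Q-learning whose state key includes recent perception-action pairs.

    Hypothetical sketch: the history length and all names are assumptions.
    """

    def __init__(self, actions, history_len=2, alpha=0.5, gamma=0.9):
        self.actions = list(actions)
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor
        self.history = deque(maxlen=history_len)  # recent (perception, action) pairs
        self.q = defaultdict(float)   # (state, action) -> estimated value

    def state(self, perception):
        # State = recent perception-action history plus the current perception.
        return (tuple(self.history), perception)

    def record(self, perception, action):
        # Append the step just taken; the oldest pair drops out automatically.
        self.history.append((perception, action))

    def best_value(self, state):
        return max(self.q[(state, a)] for a in self.actions)

    def greedy_action(self, state):
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update:
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        target = reward + self.gamma * self.best_value(next_state)
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


# Usage: one step from a corridor to a rewarded landmark (labels are made up).
learner = PASQLearner(actions=["forward", "left"])
s0 = learner.state("corridor")
learner.record("corridor", "forward")
s1 = learner.state("landmark")
learner.update(s0, "forward", reward=1.0, next_state=s1)
```

Because the history is part of the table key, two visits to the same perception reached by different routes are stored as different states, which is one simple way to tell apart places that look alike when landmarks are rare.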

Keywords :

dynamic programming; generalisation (artificial intelligence); learning (artificial intelligence); mobile robots; neural nets; FRANK; PAS; Q-learning; behavior-based mobile robot; connectionist interpretation; discrete-state; discrete-time; dynamic programming; encoding; generalisation; internal states; landmark navigation; mobile robot; neurally based perception; onboard computer; perception-action sequence; real-world reinforcement learning; time constraints; Dynamic programming; Encoding; History; Humans; Learning; Mobile robots; Multilayer perceptrons; Navigation; Protection; Time factors;

Language :

English

Publisher :

IEEE

Conference_Title :

Neural Networks, 1992. IJCNN., International Joint Conference on

Print_ISBN :

0-7803-0559-0

Type :

conf

DOI :

10.1109/IJCNN.1992.287120

Filename :

287120

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3565740