Regional Information Center for Science and Technology - Speeding up autonomous learning by using state-independent option policies and termination improvement

DocumentCode :

3208845

Title :

Speeding up autonomous learning by using state-independent option policies and termination improvement

Author :

Friske, Letícia Maria ; Ribeiro, Carlos Henrique Costa

Author_Institution :

Divisão de Ciência da Computação, Instituto Tecnológico de Aeronáutica, São José dos Campos, Brazil

fYear :

2002

fDate :

2002

Firstpage :

262

Lastpage :

267

Abstract :

In reinforcement learning applications such as autonomous robot navigation, the use of options (macro-operators) instead of low-level actions has been reported to speed up learning through more aggressive exploration of the state space. In this paper we evaluate option policies O_S, in which each option policy is a fixed sequence of actions that depends exclusively on the state in which the option is initiated. This contrasts with the option policies O_Π more common in the literature, whose action sequences depend on the states visited during option execution. One of our goals was to analyse the effect of varying the action sequence length for O_S policies. The main contribution of the paper, however, is a study of a termination improvement (TI) technique, which allows an executing option to be aborted if a more promising one is found. Experimental results show that TI for O_S options, whose benefits had already been reported for O_Π options, can be much more effective than indiscriminately increasing the option size to boost exploration of the state space, because TI adapts the length of the action sequence to the state where the option is initiated.
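
The idea of aborting a fixed action sequence when a more promising option appears can be illustrated with a minimal Python sketch. It is not the authors' implementation: the grid environment, the names `make_step` and `run_option`, the Q-table layout, and the abort rule (stop when the running option's value in the current state is strictly below the best option value there) are all assumptions made for illustration, since the abstract does not state the exact criterion.

```python
from typing import Callable, Dict, Tuple

State = Tuple[int, int]
Option = Tuple[str, ...]  # an O_S-style option: a fixed sequence of primitive actions
QTable = Dict[State, Dict[Option, float]]

MOVES = {"N": (0, -1), "S": (0, 1), "E": (1, 0), "W": (-1, 0)}


def make_step(width: int, height: int) -> Callable[[State, str], State]:
    """Build a deterministic grid transition, clipped at the walls.

    Hypothetical environment, used only to make the sketch runnable.
    """
    def step(state: State, action: str) -> State:
        dx, dy = MOVES[action]
        x = min(max(state[0] + dx, 0), width - 1)
        y = min(max(state[1] + dy, 0), height - 1)
        return (x, y)
    return step


def run_option(
    state: State,
    option: Option,
    q: QTable,
    step: Callable[[State, str], State],
    terminate_early: bool = True,
) -> Tuple[State, int]:
    """Execute a fixed action sequence, optionally with termination improvement.

    Returns the final state and the number of primitive actions executed.
    Assumed abort rule: after at least one action, stop if some other option
    has a strictly higher Q-value than the running option in the current state.
    Unknown states and options default to a value of 0.0.
    """
    executed = 0
    for action in option:
        if terminate_early and executed > 0:
            values = q.get(state, {})
            best = max(values.values(), default=0.0)
            if values.get(option, 0.0) < best:
                break  # a more promising option exists here: abort
        state = step(state, action)
        executed += 1
    return state, executed
```

With a 5x1 corridor, an option `("E", "E", "E")`, and a Q-table in which another option is valued higher at cell `(1, 0)`, the call with `terminate_early=True` stops after one move, whereas `terminate_early=False` runs all three. In this sense the effective sequence length adapts to the states encountered, without changing the nominal option size.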

Keywords :

Markov processes; decision theory; learning (artificial intelligence); mobile robots; navigation; Markov decision process; Q-learning; action sequence; mobile robot; navigation; option policy; reinforcement learning; termination improvement technique; Abortion; Convergence; Learning; Navigation; Neural networks; Orbital robotics; State feedback; State-space methods; Uncertainty;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Neural Networks, 2002. SBRN 2002. Proceedings. VII Brazilian Symposium on

Print_ISBN :

0-7695-1709-9

Type :

conf

DOI :

10.1109/SBRN.2002.1181488

Filename :

1181488

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3208845