DocumentCode :
3208845
Title :
Speeding up autonomous learning by using state-independent option policies and termination improvement
Author :
Friske, Letícia Maria ; Ribeiro, Carlos Henrique Costa
Author_Institution :
Divisão de Ciência da Computação, Instituto Tecnológico de Aeronáutica, São José dos Campos, Brazil
fYear :
2002
fDate :
2002
Firstpage :
262
Lastpage :
267
Abstract :
In reinforcement learning applications such as autonomous robot navigation, the use of options (macro-operators) instead of low-level actions has been reported to speed up learning through more aggressive exploration of the state space. In this paper we evaluate the use of option policies OS. Each option policy in this framework is a fixed sequence of actions that depends exclusively on the state in which the option is initiated. This contrasts with option policies OΠ, which are more common in the literature and correspond to action sequences that depend on the states visited during the execution of the option. One of our goals was to analyse the effects of varying the action sequence length for OS policies. The main contribution of the paper, however, is a study of a termination improvement (TI) technique that allows option execution to be aborted if a more promising option is found. Experimental results show that TI for OS options, whose benefits had already been reported for OΠ options, can be much more effective than indiscriminately increasing the option size to increase exploration of the state space, because it adapts the length of the action sequence to the state where the option is initiated.
Keywords :
Markov processes; decision theory; learning (artificial intelligence); mobile robots; navigation; Markov decision process; Q-learning; action sequence; mobile robot; navigation; option policy; reinforcement learning; termination improvement technique; Abortion; Convergence; Learning; Navigation; Neural networks; Orbital robotics; State feedback; State-space methods; Uncertainty;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Neural Networks, 2002. SBRN 2002. Proceedings. VII Brazilian Symposium on
Print_ISBN :
0-7695-1709-9
Type :
conf
DOI :
10.1109/SBRN.2002.1181488
Filename :
1181488