DocumentCode :
2032767
Title :
Domain-dependent option policies in autonomous robot learning
Author :
Friske, Letícia M. ; Ribeiro, Carlos H C
Author_Institution :
Div. de Ciencia da Computacao, Instituto Tecnologico de Aeronautica, Sao Jose dos Campos, Brazil
fYear :
2001
fDate :
2001
Firstpage :
68
Lastpage :
72
Abstract :
In control-related applications such as robotics, determining optimal solutions is very difficult for many reasons. Among these is the difficulty of finding an appropriate model of the domain, as defined by the control agent (robot), the environment where it acts, and their interaction. Reinforcement learning is a theory that defines a collection of algorithms for determining control actions under model-free assumptions, allowing control agents to learn optimal actions autonomously. In reinforcement learning, a cost functional to be optimised is determined in advance. The agent then learns how to perform this optimisation via trial and error on its environment. A trial corresponds to the execution of actions chosen by the agent, and the error is the immediate result (a real-valued reinforcement) of this action. In the work reported, we consider trials by a learning robotic agent which are based not on low-level actions but on sequences of actions (options or macro-operators). We analysed the performance, both in terms of learning speed and quality of learned control, for options that correspond to mappings from states to action policies (OΠ options). Experimental results show that careful (domain-dependent) selection of options (via methods such as discretised potential fields) produces much faster learning for option-based robots when compared to their action-based counterparts. Of critical importance, however, is the option mapping in regions of the state space where the options are not assumed to be necessary: as the performance of reinforcement learning algorithms is strongly dependent on sufficient exploration of the state space, even in such regions a careful, ad hoc selection of actions is of foremost importance.
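The idea of learning over options rather than low-level actions can be illustrated with a minimal sketch. It assumes a toy one-dimensional corridor, two hand-written options (each a mapping from state to primitive action), a fixed option duration, and a standard SMDP-style Q-learning update. None of these details (the domain, the names `OPTIONS`, `OPTION_LENGTH`, `run_option`, `train`, or the parameter values) come from the paper, and the sketch does not reproduce the authors' potential-field option selection.

```python
import random

N_STATES = 10          # corridor cells 0..9; the goal is the last cell (assumed toy domain)
GOAL = N_STATES - 1
GAMMA, ALPHA, EPSILON = 0.9, 0.5, 0.2   # illustrative values, not from the paper

# Each option is a policy: a mapping from state to primitive action (-1 or +1).
OPTIONS = {
    "go_left": lambda s: -1,
    "go_right": lambda s: +1,
}
OPTION_LENGTH = 3      # hypothetical fixed duration of an option, in primitive steps


def step(state, action):
    """Apply one primitive action; reward 1.0 on reaching the goal, else 0.0."""
    nxt = min(max(state + action, 0), GOAL)
    return nxt, (1.0 if nxt == GOAL else 0.0)


def run_option(state, name):
    """Execute an option; return (next_state, discounted_reward, steps_taken)."""
    total, discount, steps = 0.0, 1.0, 0
    while steps < OPTION_LENGTH and state != GOAL:
        state, reward = step(state, OPTIONS[name](state))
        total += discount * reward
        discount *= GAMMA
        steps += 1
    return state, total, steps


def train(episodes=500, seed=0):
    """Tabular Q-learning where each trial is a whole option, not a single action."""
    rng = random.Random(seed)
    q = {(s, o): 0.0 for s in range(N_STATES) for o in OPTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on option executions per episode
            if state == GOAL:
                break
            # Epsilon-greedy choice among options.
            if rng.random() < EPSILON:
                name = rng.choice(list(OPTIONS))
            else:
                name = max(OPTIONS, key=lambda o: q[(state, o)])
            nxt, reward, k = run_option(state, name)
            # SMDP-style target: discount the bootstrap term by the option's duration k.
            best_next = 0.0 if nxt == GOAL else max(q[(nxt, o)] for o in OPTIONS)
            target = reward + (GAMMA ** k) * best_next
            q[(state, name)] += ALPHA * (target - q[(state, name)])
            state = nxt
    return q


def greedy_option(q, state):
    """Return the option with the highest learned value in the given state."""
    return max(OPTIONS, key=lambda o: q[(state, o)])
```

Calling `train()` and then `greedy_option(q, 0)` shows which option the learned values favour at the start cell. Because exploration happens at the level of options, states that no option sequence passes through are never visited, which is one way to see why the abstract stresses the option mapping across the whole state space.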
Keywords :
learning (artificial intelligence); robots; software agents; autonomous robot learning; control actions; control agent; cost functional; domain-dependent option policies; model-free assumptions; optimal action learning; optimisation; reinforcement learning
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Science Society, 2001. SCCC '01. Proceedings. XXI International Conference of the Chilean
Conference_Location :
Punta Arenas
ISSN :
1522-4902
Print_ISBN :
0-7695-1396-4
Type :
conf
DOI :
10.1109/SCCC.2001.972633
Filename :
972633