Domain-dependent option policies in autonomous robot learning

Author

Friske, Letícia M. ; Ribeiro, Carlos H. C.

Author_Institution

Div. de Ciencia da Computacao, Instituto Tecnologico de Aeronautica, Sao Jose dos Campos, Brazil

fYear

2001

fDate

2001

Firstpage

68

Lastpage

72

Abstract

In control-related applications such as robotics, determining optimal solutions is very difficult for many reasons. Chief among these is the difficulty of finding an appropriate model of the domain, as defined by the control agent (robot), the environment in which it acts, and their interaction. Reinforcement learning is a theory that defines a collection of algorithms for determining control actions under model-free assumptions, allowing control agents to learn optimal actions autonomously. In reinforcement learning, a cost functional to be optimised is determined in advance. The agent then learns how to perform this optimisation via trial and error on its environment. A trial corresponds to the execution of actions chosen by the agent, and the error is the immediate result (a real-valued reinforcement) of this action. In the work reported, we consider trials by a learning robotic agent that are based not on low-level actions but on sequences of actions (options or macro-operators). We analysed the performance, in terms of both learning speed and quality of learned control, for options that correspond to mappings from states to action policies (O_Π options). Experimental results show that careful (domain-dependent) selection of options (via methods such as discretised potential fields) produces much faster learning for option-based robots than for their action-based counterparts. Of critical importance, however, is the option mapping in regions of the state space where the options are not assumed to be necessary: because the performance of reinforcement learning algorithms depends strongly on sufficient exploration of the state space, a careful, ad hoc selection of actions is of foremost importance even in such regions.
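The idea of learning over options rather than low-level actions can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy corridor domain, the two fixed-policy options `left3` and `right3`, the reward values, the learning parameters, and the use of a standard SMDP-style Q-learning update. It is not the authors' experimental setup, and it does not use potential fields.

```python
import random

# Hypothetical toy domain: a corridor of cells 0..9, with the goal at cell 9.
N_STATES = 10
GOAL = N_STATES - 1
GAMMA, ALPHA, EPSILON = 0.9, 0.5, 0.2  # illustrative values, not from the paper


def step(state, action):
    """Apply a primitive action (-1 = left, +1 = right)."""
    nxt = min(max(state + action, 0), GOAL)
    done = nxt == GOAL
    return nxt, (1.0 if done else -0.01), done


# An option is modelled here as (state -> primitive action policy, max duration).
OPTIONS = {
    "left3": (lambda s: -1, 3),
    "right3": (lambda s: +1, 3),
}


def run_option(state, name):
    """Run an option to termination; return (state, discounted return, steps, done)."""
    policy, max_steps = OPTIONS[name]
    total, discount, steps, done = 0.0, 1.0, 0, False
    while steps < max_steps and not done:
        state, reward, done = step(state, policy(state))
        total += discount * reward
        discount *= GAMMA
        steps += 1
    return state, total, steps, done


def greedy_option(q, state):
    """Option with the highest learned value in the given state."""
    return max(sorted(OPTIONS), key=lambda o: q[(state, o)])


def train(episodes=200, seed=0):
    """Tabular Q-learning in which each trial executes a whole option."""
    rng = random.Random(seed)
    q = {(s, o): 0.0 for s in range(N_STATES) for o in OPTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on the number of options per episode
            if rng.random() < EPSILON:
                opt = rng.choice(sorted(OPTIONS))  # exploratory choice
            else:
                opt = greedy_option(q, state)
            nxt, ret, steps, done = run_option(state, opt)
            best_next = 0.0 if done else max(q[(nxt, o)] for o in OPTIONS)
            # Discount the bootstrap term by the number of primitive steps taken.
            target = ret + (GAMMA ** steps) * best_next
            q[(state, opt)] += ALPHA * (target - q[(state, opt)])
            state = nxt
            if done:
                break
    return q
```

Calling `q = train()` and then `greedy_option(q, 0)` gives the option the sketch prefers at the start cell. Because each update spans several primitive steps, reward information propagates back along the corridor in fewer trials than it would with single-step actions, which is the intuition behind the speed-up the abstract reports.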

Keywords

learning (artificial intelligence); robots; software agents; autonomous robot learning; control actions; control agent; cost functional; domain-dependent option policies; model-free assumptions; optimal action learning; optimisation; reinforcement learning

fLanguage

English

Publisher

IEEE

Conference_Titel

Computer Science Society, 2001. SCCC '01. Proceedings. XXI International Conference of the Chilean

Conference_Location

Punta Arenas

ISSN

1522-4902

Print_ISBN

0-7695-1396-4

Type

conf

DOI

10.1109/SCCC.2001.972633

Filename

972633