Reinforcement learning in a behaviour-based control architecture for marine archaeology

Author

Gordon Frost;Francesco Maurelli;David M Lane

Author_Institution

Ocean Systems Laboratory, School of Engineering &

fYear

2015

fDate

5/1/2015 12:00:00 AM

Firstpage

1

Lastpage

5

Abstract

We present a novel path planner for adaptive behaviour of an Autonomous Underwater Vehicle (AUV). A behaviour-based architecture forms the foundation of the system, with an extra layer that uses experience to learn a policy for modulating the behaviours' weights. In effect, this creates an abstract environment for the Reinforcement Learning (RL) agent's state and action space. This in turn simplifies the problem the RL agent is addressing, creating a more stable system. The Episodic Natural Actor Critic (ENAC) RL algorithm is used because of the continuous input and output domains and the convergence properties of natural actor-critic methods. The adaptiveness of the system is demonstrated in a thruster failure scenario, in which RL is used to learn an appropriate policy for the behaviours' weights under the new vehicle dynamics. We apply this control architecture to the domain of marine archaeology, which has an inherent problem of navigation in unknown, potentially complex and dangerous environments. Simulated results of the proposed control architecture demonstrate its feasibility and performance.
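
The idea of modulating behaviour weights can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how behaviour outputs are combined, so the normalised weighted sum, the function name `fuse_behaviours`, the two example behaviours, and the (surge, yaw-rate) command layout are all assumptions made for illustration. In the architecture described above, the weights would come from a learned policy (ENAC in the paper); here they are fixed by hand.

```python
def fuse_behaviours(commands, weights):
    """Blend per-behaviour commands with a normalised weighted sum.

    commands: list of equal-length command vectors, one per behaviour.
    weights:  list of non-negative weights, one per behaviour.
    (Assumed fusion rule; the abstract does not specify one.)
    """
    if len(commands) != len(weights):
        raise ValueError("need exactly one weight per behaviour")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")

    n_dof = len(commands[0])
    total = sum(weights)
    if total == 0:
        # No behaviour is active: command zero motion.
        return [0.0] * n_dof

    return [
        sum(w * c[i] for w, c in zip(weights, commands)) / total
        for i in range(n_dof)
    ]


# Hypothetical behaviours, each proposing a (surge, yaw-rate) command.
go_to_goal = [1.0, 0.0]
avoid_obstacle = [0.2, 0.8]

# Hand-picked weights standing in for the output of a learned policy.
blended = fuse_behaviours([go_to_goal, avoid_obstacle], weights=[3.0, 1.0])
print(blended)
```

Under this assumed rule, the learning agent only has to choose a small weight vector rather than raw thruster commands, which is one way to read the abstract's claim that the behaviour layer simplifies the problem the RL agent addresses.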

Keywords

"Vehicles","Modulation","Surges","Learning (artificial intelligence)","Machine learning algorithms","Robots","Adaptive systems"

Publisher

ieee

Conference_Titel

OCEANS 2015 - Genova

Type

conf

DOI

10.1109/OCEANS-Genova.2015.7271619

Filename

7271619