Real-World Reinforcement Learning via Multifidelity Simulators

Author

Cutler, Mark ; Walsh, Thomas J. ; How, Jonathan P.

Author_Institution

Lab. of Inf. & Decision Syst., Massachusetts Inst. of Technol., Cambridge, MA, USA

Volume

31

Issue

3

fYear

2015

fDate

Jun-15

Firstpage

655

Lastpage

671

Abstract

Reinforcement learning (RL) can be a tool for designing policies and controllers for robotic systems. However, the cost of real-world samples remains prohibitive as many RL algorithms require a large number of samples before learning useful policies. Simulators are one way to decrease the number of required real-world samples, but imperfect models make deciding when and how to trust samples from a simulator difficult. We present a framework for efficient RL in a scenario where multiple simulators of a target task are available, each with varying levels of fidelity. The framework is designed to limit the number of samples used in each successively higher-fidelity/cost simulator by allowing a learning agent to choose to run trajectories at the lowest level simulator that will still provide it with useful information. Theoretical proofs of the framework's sample complexity are given and empirical results are demonstrated on a remote-controlled car with multiple simulators. The approach enables RL algorithms to find near-optimal policies in a physical robot domain with fewer expensive real-world samples than previous transfer approaches or learning without simulators.

Keywords

learning (artificial intelligence); telerobotics; RL algorithm; higher-fidelity-cost simulator; lowest level simulator; multifidelity simulator; reinforcement learning; remote-controlled car; robotic system; Accuracy; Complexity theory; Data models; Learning (artificial intelligence); Mathematical model; Optimization; Robots; Animation and simulation; autonomous agents; learning and adaptive systems; reinforcement learning (RL);

fLanguage

English

Journal_Title

Robotics, IEEE Transactions on

Publisher

ieee

ISSN

1552-3098

Type

jour

DOI

10.1109/TRO.2015.2419431

Filename

7106543