DocumentCode
63758
Title
Real-World Reinforcement Learning via Multifidelity Simulators
Author
Cutler, Mark ; Walsh, Thomas J. ; How, Jonathan P.
Author_Institution
Lab. of Inf. & Decision Syst., Massachusetts Inst. of Technol., Cambridge, MA, USA
Volume
31
Issue
3
fYear
2015
fDate
Jun-15
Firstpage
655
Lastpage
671
Abstract
Reinforcement learning (RL) can be a tool for designing policies and controllers for robotic systems. However, the cost of real-world samples remains prohibitive as many RL algorithms require a large number of samples before learning useful policies. Simulators are one way to decrease the number of required real-world samples, but imperfect models make deciding when and how to trust samples from a simulator difficult. We present a framework for efficient RL in a scenario where multiple simulators of a target task are available, each with varying levels of fidelity. The framework is designed to limit the number of samples used in each successively higher-fidelity/cost simulator by allowing a learning agent to choose to run trajectories at the lowest level simulator that will still provide it with useful information. Theoretical proofs of the framework's sample complexity are given and empirical results are demonstrated on a remote-controlled car with multiple simulators. The approach enables RL algorithms to find near-optimal policies in a physical robot domain with fewer expensive real-world samples than previous transfer approaches or learning without simulators.
Keywords
learning (artificial intelligence); telerobotics; RL algorithm; higher-fidelity-cost simulator; lowest level simulator; multifidelity simulator; reinforcement learning; remote-controlled car; robotic system; Accuracy; Complexity theory; Data models; Learning (artificial intelligence); Mathematical model; Optimization; Robots; Animation and simulation; autonomous agents; learning and adaptive systems; reinforcement learning (RL)
fLanguage
English
Journal_Title
Robotics, IEEE Transactions on
Publisher
IEEE
ISSN
1552-3098
Type
jour
DOI
10.1109/TRO.2015.2419431
Filename
7106543
Link To Document