Title :
Optimism-driven exploration for nonlinear systems
Author :
Moldovan, Teodor Mihai ; Levine, Sergey ; Jordan, Michael I. ; Abbeel, Pieter
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Univ. of California, Berkeley, Berkeley, CA, USA
Abstract :
Tasks with unknown dynamics and costly system interaction time present a serious challenge for reinforcement learning. If a model of the dynamics can be learned quickly, interaction time can be reduced substantially. We show that combining an optimistic exploration strategy with model-predictive control can achieve very good sample complexity for a range of nonlinear systems. Our method learns a Dirichlet process mixture of linear models using an exploration strategy based on optimism in the face of uncertainty. Trajectory optimization is used to plan paths in the learned model that both minimize the cost and perform exploration. Experimental results show that our approach achieves some of the most sample-efficient learning rates on several benchmark problems, and is able to successfully learn to control a simulated helicopter during hover and autorotation with only seconds of interaction time. The computational requirements are substantial.
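Illustrative note (not part of the published abstract): the loop the abstract describes, learning a model, planning optimistically in it, and replanning after every step, can be sketched in a heavily simplified form. Everything in the sketch is an assumption made for illustration. A single ridge-regression linear model stands in for the Dirichlet process mixture of linear models, random shooting stands in for trajectory optimization, and the scalar system `x_next = 0.9*x + 0.5*u`, the bonus weight `beta`, and all function names are hypothetical.

```python
import math
import random

def fit_linear(data, lam=1e-3):
    """Ridge fit of x_next ~ a*x + b*u from (x, u, x_next) triples.
    Returns (a, b) and P = (Z^T Z + lam*I)^-1, which shrinks in visited directions."""
    sxx = lam + sum(x * x for x, u, y in data)
    suu = lam + sum(u * u for x, u, y in data)
    sxu = sum(x * u for x, u, y in data)
    rx = sum(x * y for x, u, y in data)
    ru = sum(u * y for x, u, y in data)
    det = sxx * suu - sxu * sxu  # positive because lam > 0
    P = ((suu / det, -sxu / det), (-sxu / det, sxx / det))
    a = P[0][0] * rx + P[0][1] * ru
    b = P[1][0] * rx + P[1][1] * ru
    return (a, b), P

def uncertainty(P, x, u):
    """sqrt(z^T P z) for z = (x, u): large where the model has seen little data."""
    q = P[0][0] * x * x + 2.0 * P[0][1] * x * u + P[1][1] * u * u
    return math.sqrt(max(q, 0.0))

def optimistic_cost(theta, P, x0, actions, target, beta):
    """Task cost under the learned model minus an uncertainty bonus (optimism)."""
    a, b = theta
    x, total = x0, 0.0
    for u in actions:
        total -= beta * uncertainty(P, x, u)  # reward visiting uncertain (x, u)
        x = a * x + b * u                     # roll out the learned model
        total += (x - target) ** 2 + 0.01 * u * u
    return total

def plan(theta, P, x0, target, beta, horizon=5, samples=200, rng=random):
    """Random-shooting planner: return the first action of the best sampled sequence."""
    best_u, best_cost = 0.0, float("inf")
    for _ in range(samples):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        cost = optimistic_cost(theta, P, x0, actions, target, beta)
        if cost < best_cost:
            best_cost, best_u = cost, actions[0]
    return best_u

def run(steps=30, target=1.0, beta=0.1, seed=0):
    """Model-predictive control loop: refit, replan, apply one action, repeat."""
    rng = random.Random(seed)

    def true_step(x, u):
        # Hypothetical "unknown" system, used only to generate data.
        return 0.9 * x + 0.5 * u

    data, x = [], 0.0
    for _ in range(steps):
        theta, P = fit_linear(data)
        u = plan(theta, P, x, target, beta, rng=rng)
        x_next = true_step(x, u)
        data.append((x, u, x_next))
        x = x_next
    return data, fit_linear(data)[0]

if __name__ == "__main__":
    data, theta = run()
    print("estimated (a, b):", theta, "final state:", data[-1][2])
```

Setting `beta=0` removes the bonus and reduces the sketch to plain certainty-equivalent model-predictive control, which makes it easy to compare behaviour with and without optimism.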
Keywords :
aircraft control; computational complexity; cost reduction; helicopters; hovercraft; learning (artificial intelligence); nonlinear control systems; predictive control; trajectory optimisation (aerospace); Dirichlet process mixture; autorotation; benchmark problems; cost minimization; hover; interaction time reduction; learned model; linear model; model predictive control; nonlinear system; optimism driven exploration strategy; path planning; reinforcement learning; sample complexity; sample efficient learning rate; simulated helicopter control; trajectory optimization; uncertainty; unknown system dynamics; Computational modeling; Heuristic algorithms; Least squares approximations; Optimization; Trajectory; Uncertainty;
Conference_Title :
Robotics and Automation (ICRA), 2015 IEEE International Conference on
Conference_Location :
Seattle, WA
DOI :
10.1109/ICRA.2015.7139645