Title :
Locally weighted least squares policy iteration for model-free learning in uncertain environments
Author :
Howard, Michael ; Nakamura, Yoshihiko
Author_Institution :
Dept. Inf., King's Coll. London, London, UK
Abstract :
This paper introduces Locally Weighted Least Squares Policy Iteration for learning approximate optimal control in settings where models of the dynamics and cost function are either unavailable or hard to obtain. Building on recent advances in Least Squares Temporal Difference Learning, the proposed approach learns from data collected through interaction with a system and builds a global control policy from localised models of the state-action value function. Evaluations characterise learning performance on non-linear control problems, including an under-powered pendulum swing-up task and a robotic door-opening problem under different dynamical conditions.
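The idea of a localised value model fitted by least squares temporal difference learning can be illustrated with a minimal sketch. This is not the paper's formulation: the function name `local_lstdq`, the `(x, r, x_next)` sample layout, the scalar state-action point `x`, the Gaussian weighting, and the local linear features `[1, x - x_query]` are all assumptions chosen for brevity.

```python
import math

def local_lstdq(samples, x_query, gamma=0.9, bandwidth=0.5):
    """Locally weighted LSTD estimate of the value at x_query (illustrative only).

    samples: list of (x, r, x_next) tuples with scalar x, where x_next is the
    point reached from x under the policy being evaluated (hypothetical layout).
    """
    # Accumulate the 2x2 system A @ theta = b for features phi = [1, x - x_query].
    a00 = a01 = a10 = a11 = b0 = b1 = 0.0
    for x, r, x_next in samples:
        # Gaussian locality weight: samples near the query dominate the fit.
        w = math.exp(-0.5 * ((x - x_query) / bandwidth) ** 2)
        p0, p1 = 1.0, x - x_query                  # phi(x)
        d0 = p0 - gamma * 1.0                      # phi(x) - gamma * phi(x_next), component 0
        d1 = p1 - gamma * (x_next - x_query)       # component 1
        a00 += w * p0 * d0
        a01 += w * p0 * d1
        a10 += w * p1 * d0
        a11 += w * p1 * d1
        b0 += w * p0 * r
        b1 += w * p1 * r
    det = a00 * a11 - a01 * a10
    if abs(det) < 1e-12:
        raise ValueError("local system is singular; add data or widen the bandwidth")
    # Cramer's rule for theta[0]; since phi(x_query) = [1, 0], theta[0] is the estimate.
    return (b0 * a11 - a01 * b1) / det
```

In a policy iteration loop, an estimate of this kind would be recomputed at each query point and the policy improved greedily against it. As a sanity check under these assumptions, a constant reward of 1 with `gamma=0.9` should give an estimate close to 1 / (1 - 0.9) = 10 at any query point.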
Keywords :
doors; iterative methods; learning systems; least squares approximations; nonlinear control systems; optimal control; pendulums; robots; approximate optimal control learning; cost function; dynamics; global control policy; least squares temporal difference learning; locally weighted least squares policy iteration; model-free learning; nonlinear control problems; robotic door-opening problem; state-action value function; uncertain environments; under-powered pendulum swing-up task; Computational modeling; Data models; Least squares approximations; Robot sensing systems; Trajectory;
Conference_Title :
Intelligent Robots and Systems (IROS), 2013 IEEE/RSJ International Conference on
Conference_Location :
Tokyo
DOI :
10.1109/IROS.2013.6696506