DocumentCode :
663509
Title :
Locally weighted least squares policy iteration for model-free learning in uncertain environments
Author :
Howard, Michael ; Nakamura, Yoshihiko
Author_Institution :
Dept. Inf., King's Coll. London, London, UK
fYear :
2013
fDate :
3-7 Nov. 2013
Firstpage :
1223
Lastpage :
1229
Abstract :
This paper introduces Locally Weighted Least Squares Policy Iteration for learning approximate optimal control in settings where models of the dynamics and cost function are either unavailable or hard to obtain. Building on recent advances in Least Squares Temporal Difference Learning, the proposed approach is able to learn from data collected from interactions with a system, in order to build a global control policy based on localised models of the state-action value function. Evaluations are reported characterising learning performance for non-linear control problems including an under-powered pendulum swing-up task, and a robotic door-opening problem under different dynamical conditions.
Keywords :
doors; iterative methods; learning systems; least squares approximations; nonlinear control systems; optimal control; pendulums; robots; approximate optimal control learning; cost function; dynamics; global control policy; least squares temporal difference learning; locally weighted least squares policy iteration; model-free learning; nonlinear control problems; robotic door-opening problem; state-action value function; uncertain environments; under-powered pendulum swing-up task; Computational modeling; Data models; Least squares approximations; Robot sensing systems; Trajectory;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Robots and Systems (IROS), 2013 IEEE/RSJ International Conference on
Conference_Location :
Tokyo
ISSN :
2153-0858
Type :
conf
DOI :
10.1109/IROS.2013.6696506
Filename :
6696506