DocumentCode :
561191
Title :
A Non-parametric Approach to Approximate Dynamic Programming
Author :
Glaude, Hadrien ; Akrimi, Fadi ; Geist, Matthieu ; Pietquin, Olivier
Author_Institution :
SUPELEC (IMS Res. Group), Metz, France
Volume :
1
fYear :
2011
fDate :
18-21 Dec. 2011
Firstpage :
317
Lastpage :
322
Abstract :
Approximate Dynamic Programming (ADP) is a machine learning method that aims to learn an optimal control policy for a dynamic and stochastic system from a logged set of observed interactions between the system and one or several non-optimal controllers. It defines a particular class of Reinforcement Learning (RL) algorithms, RL being a general paradigm for learning such a control policy from interactions. ADP addresses the problem of systems whose state space is too large to be enumerated in the memory of a computer. Because of this, approximation schemes are used to generalize estimates over continuous state spaces. Nevertheless, RL still suffers from a lack of scalability to multidimensional continuous state spaces. In this paper, we propose the use of the Locally Weighted Projection Regression (LWPR) method to handle this scalability problem. We demonstrate the efficacy of our approach on two standard benchmarks modified to exhibit larger state spaces.
Keywords :
approximation theory; dynamic programming; learning (artificial intelligence); optimal control; regression analysis; stochastic systems; approximate dynamic programming; approximation scheme; dynamic system; locally weighted projection regression method; machine learning method; multidimensional continuous state space; nonparametric approach; optimal control policy learning; reinforcement learning algorithm; scalability problem; state space system; stochastic system; Approximation algorithms; Function approximation; Measurement; Noise; Training; Trajectory;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Applications and Workshops (ICMLA), 2011 10th International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
978-1-4577-2134-2
Type :
conf
DOI :
10.1109/ICMLA.2011.19
Filename :
6146991