Title :
Uncertainty propagation for quality assurance in Reinforcement Learning
Author :
Schneegass, Daniel ; Udluft, Steffen ; Martinetz, Thomas
Author_Institution :
Corporate Technology, Information & Communications, Learning Systems Department, Siemens AG, Munich
Abstract :
In this paper we address the reliability of policies derived by Reinforcement Learning from a limited number of observations. This can be done in a principled manner by taking into account the uncertainty of the derived Q-function, which stems from the uncertainty of the estimators used for the Markov decision process's (MDP's) transition probabilities and the reward function. We apply uncertainty propagation in parallel with the Bellman iteration and obtain confidence intervals for the Q-function. In a second step we modify the Bellman operator so as to obtain a policy that guarantees the highest minimum performance with a given probability. We demonstrate the functionality of our method on artificial examples and show that, for an important class of problems, even an improvement of the expected performance can be obtained. Finally, we verify this observation in an application to gas turbine control.
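The following is a minimal Python sketch of the idea described in the abstract: tabular Bellman iteration on an estimated MDP, with first-order (Gaussian) uncertainty propagation run in parallel to attach a standard deviation to each Q-value, and a changed operator that selects actions by the lower confidence bound Q - xi*sigma. All names (uncertain_q_iteration, P_hat, var_P, xi) are illustrative, and the diagonal-covariance simplification is an assumption; the paper's full method tracks covariances between the estimators.

import numpy as np

def uncertain_q_iteration(P_hat, R_hat, var_P, var_R,
                          gamma=0.9, xi=2.0, n_iter=200):
    """Tabular Bellman iteration with first-order uncertainty propagation.

    P_hat[s, a, s'] : estimated transition probabilities
    R_hat[s, a, s'] : estimated mean rewards
    var_P, var_R    : elementwise variances of those estimators
    xi              : confidence parameter scaling the uncertainty penalty
    """
    n_s, n_a, _ = P_hat.shape
    Q = np.zeros((n_s, n_a))
    var_Q = np.zeros((n_s, n_a))  # propagated variance of the Q estimates

    for _ in range(n_iter):
        # Changed Bellman operator: evaluate the action that maximizes the
        # lower confidence bound Q - xi*sigma ("certain-optimal" action).
        a_star = (Q - xi * np.sqrt(var_Q)).argmax(axis=1)
        V = Q[np.arange(n_s), a_star]
        var_V = var_Q[np.arange(n_s), a_star]

        # Bellman backup on the estimated MDP:
        # Q(s,a) = sum_s' P(s'|s,a) * (R(s,a,s') + gamma * V(s'))
        target = R_hat + gamma * V[None, None, :]
        Q_new = (P_hat * target).sum(axis=2)

        # Gaussian error propagation through the backup, keeping only the
        # diagonal of the covariance (a simplification of the full method):
        var_Q_new = ((target ** 2) * var_P).sum(axis=2) \
                  + ((P_hat ** 2) * var_R).sum(axis=2) \
                  + (gamma ** 2) * ((P_hat ** 2) * var_V[None, None, :]).sum(axis=2)

        Q, var_Q = Q_new, var_Q_new

    sigma_Q = np.sqrt(var_Q)
    policy = (Q - xi * sigma_Q).argmax(axis=1)  # guaranteed-performance policy
    return Q, sigma_Q, policy

With estimators built from observed transition counts, var_P could for instance be taken as the multinomial variances p(1-p)/n; a larger xi trades expected performance for a higher guaranteed minimum.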
Keywords :
iterative methods; learning (artificial intelligence); probability; quality assurance; uncertainty handling; Bellman iteration; Bellman operator; Q-function uncertainty; confidence interval; policy reliability; reinforcement learning; reward function; transition probabilities; uncertainty propagation; Bayesian methods; Communications technology; Function approximation; Gaussian processes; Learning systems; Measurement uncertainty; Optimization methods; Probability; Quality assurance; Turbines
Conference_Title :
2008 IEEE International Joint Conference on Neural Networks (IJCNN 2008; IEEE World Congress on Computational Intelligence)
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4244-1820-6
ISSN :
1098-7576
DOI :
10.1109/IJCNN.2008.4634160