Title :
Enhancing the episodic natural actor-critic algorithm by a regularisation term to stabilize learning of control structures
Author :
Witsch, Andreas ; Reichle, Roland ; Geihs, Kurt ; Lange, Sascha ; Riedmiller, Martin
Author_Institution :
Distrib. Syst. Group, Univ. Kassel, Kassel, Germany
Abstract :
Incomplete or imprecise models of control systems make it difficult to find an appropriate structure and parameter set for a corresponding control policy. These problems are addressed by reinforcement learning algorithms such as policy gradient methods. We describe how to stabilise the policy gradient descent by introducing a regularisation term to enhance the episodic natural actor-critic approach. This allows a more policy-independent usage. We used the resulting algorithm to optimise a z-transformed rational function representing the control policy. This representation facilitates simultaneous optimisation of the control structure and its parameters in time space and can be analysed in terms of control theory to predict the control behaviour for arbitrary scenarios. Furthermore, we present a solution to the general problem of finding an initial parameter set with the help of a single demonstrated trajectory. The approach is evaluated on a cartpole simulation to demonstrate the expressiveness of the policy. In addition, a real soccer robot scenario demonstrates the ability of the proposed approach to deal with real-world scenarios.
Keywords :
adaptive control; gradient methods; learning (artificial intelligence); learning systems; mobile robots; stability; arbitrary scenarios; control behaviour; control structure learning stability; episodic natural actor-critic algorithm; policy gradient methods; regularisation term; reinforcement learning algorithms; soccer robot scenario; z-transformed rational function; Approximation algorithms; Equations; Estimation; Function approximation; Mathematical model; Trajectory; Transfer functions
Conference_Title :
Adaptive Dynamic Programming And Reinforcement Learning (ADPRL), 2011 IEEE Symposium on
Conference_Location :
Paris
Print_ISBN :
978-1-4244-9887-1
DOI :
10.1109/ADPRL.2011.5967352