Title :
Inverse Reinforcement Learning using Expectation Maximization in mixture models
Author :
Hahn, Jürgen; Zoubir, Abdelhak M.
Author_Institution :
Signal Processing Group, Technische Universität Darmstadt, Darmstadt, Germany
Abstract :
Reinforcement Learning (RL) is an attractive tool for learning controllers that are optimal with respect to a given reward function. In conventional RL, an expert is usually required to design the reward function, since the performance of RL depends strongly on it. Inverse Reinforcement Learning (IRL) offers an alternative in which the reward function is estimated from observed data. In this work, we propose a novel approach to IRL based on a generative probabilistic model of RL. We derive an Expectation Maximization algorithm that simultaneously estimates the reward and the optimal policy for finite state and action spaces, and which can easily be extended to the infinite case. By means of two toy examples, we show that the proposed algorithm works well even with a small number of observations and converges after only a few iterations.
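The abstract does not spell out the generative model, so the following is only a rough numpy sketch of the kind of alternating scheme it describes for a finite MDP: an "E-step" that computes a soft-optimal policy under the current reward estimate, and an "M-step" that updates the reward from the mismatch between demonstrated and predicted state visitations (a MaxEnt-IRL-style gradient standing in for the paper's mixture-model update). The MDP, the demonstration data, and the step size are all invented for illustration and are not taken from the paper.

```python
import numpy as np

# Hypothetical toy MDP; all sizes and the demo data are illustrative.
rng = np.random.default_rng(0)
n_states, n_actions, gamma = 5, 2, 0.9

# Random row-stochastic transition model P[a, s, s'].
P = rng.random((n_actions, n_states, n_states))
P /= P.sum(axis=2, keepdims=True)

# Observed expert demonstrations as (state, action) pairs.
demos = [(0, 1), (1, 1), (2, 0), (3, 1), (4, 0), (1, 1), (2, 0)]
emp_visits = np.bincount([s for s, _ in demos], minlength=n_states)
emp_visits = emp_visits / emp_visits.sum()

def soft_policy(R, n_sweeps=100):
    """Soft value iteration: stochastic policy pi[s, a] under reward R."""
    V = np.zeros(n_states)
    Q = np.zeros((n_states, n_actions))
    for _ in range(n_sweeps):
        Q = R[:, None] + gamma * (P @ V).T      # Q[s, a]
        V = np.log(np.exp(Q).sum(axis=1))       # soft maximum over actions
    return np.exp(Q - V[:, None])               # rows sum to one

def expected_visits(pi, horizon=20):
    """Average state occupancy when following pi from a uniform start."""
    d = np.full(n_states, 1.0 / n_states)
    total = np.zeros(n_states)
    for _ in range(horizon):
        total += d
        d = np.einsum('s,sa,ast->t', d, pi, P)  # one forward step
    return total / horizon

R = np.zeros(n_states)                          # flat initial reward
for it in range(200):
    pi = soft_policy(R)                         # "E-step": policy given R
    grad = emp_visits - expected_visits(pi)     # visitation mismatch
    R += 0.1 * grad                             # "M-step": reward update

print("estimated reward per state:", np.round(R, 2))
```

As in IRL generally, the reward is identifiable only up to shaping and scaling ambiguities, so only the relative values of the estimated R are meaningful in this sketch.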
Keywords :
expectation-maximisation algorithm; learning (artificial intelligence); mixture models; probability; IRL; action spaces; expectation maximization algorithm; finite state spaces; generative probabilistic model; inverse reinforcement learning; optimal controllers; optimal policy; reward function; Probabilistic logic; Expectation Maximization; Inverse Reinforcement Learning; Markov Decision Process
Conference_Title :
2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
South Brisbane, QLD, Australia
DOI :
10.1109/ICASSP.2015.7178666