Title :
Learning partially observable Markov decision model with EM algorithm
Author :
Hui Tan ; Shaohui Ma
Author_Institution :
Sch. of Econ. & Manage., Jiangsu Univ. of Sci. & Technol., Zhenjiang, China
Abstract :
Most existing research focuses on POMDP modeling or solution methods. In some fields, however, a POMDP model must first be learned from historical data before an optimal policy can be obtained from it. Assuming that the historical data comprise an observation sequence and an action sequence, while the state sequence is unobservable, we derive the formulas needed to estimate the parameters of a POMDP model with the EM algorithm, including the initial state distribution, the stochastic transition matrix, and the observation probability function.
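
The estimation problem described in the abstract can be illustrated with a small sketch. This is not the authors' derivation: it is a generic Baum-Welch-style EM iteration adapted to action-dependent transitions, written under several assumptions. States, actions, and observations are assumed to be finite and integer-coded, the observation probability is assumed to depend only on the current state, and `actions[t]` is assumed to be the action taken between observations `t` and `t + 1`. The names `em_step`, `run_em`, `normalize`, and the toy data are all hypothetical.

```python
import math


def normalize(counts, fallback):
    """Turn expected counts into a probability row; keep the old row if empty."""
    total = sum(counts)
    return [x / total for x in counts] if total > 0 else list(fallback)


def em_step(pi, T, Z, actions, obs):
    """One EM iteration.

    pi[s]       : initial state distribution
    T[a][s][s2] : transition probability under action a
    Z[s][o]     : observation probability (assumed state-dependent only)
    Returns (new_pi, new_T, new_Z, log-likelihood under the input parameters).
    """
    n, L = len(pi), len(obs)

    # E-step, forward pass with per-step scaling factors c[t].
    alpha, c = [], []
    for t in range(L):
        if t == 0:
            row = [pi[s] * Z[s][obs[0]] for s in range(n)]
        else:
            Ta = T[actions[t - 1]]
            row = [Z[j][obs[t]] * sum(alpha[t - 1][i] * Ta[i][j] for i in range(n))
                   for j in range(n)]
        c.append(sum(row))
        alpha.append([x / c[t] for x in row])

    # E-step, backward pass using the same scaling factors.
    beta = [[1.0] * n for _ in range(L)]
    for t in range(L - 2, -1, -1):
        Ta = T[actions[t]]
        beta[t] = [sum(Ta[i][j] * Z[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
                   / c[t + 1] for i in range(n)]

    # Posterior state probabilities gamma[t][s].
    gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(L)]

    # M-step: expected transition counts, grouped by the action taken.
    num_T = [[[0.0] * n for _ in range(n)] for _ in T]
    for t in range(L - 1):
        a = actions[t]
        for i in range(n):
            for j in range(n):
                num_T[a][i][j] += (alpha[t][i] * T[a][i][j] * Z[j][obs[t + 1]]
                                   * beta[t + 1][j] / c[t + 1])
    new_T = [[normalize(num_T[a][i], T[a][i]) for i in range(n)]
             for a in range(len(T))]

    # M-step: expected observation counts per state.
    num_Z = [[0.0] * len(Z[0]) for _ in range(n)]
    for t in range(L):
        for i in range(n):
            num_Z[i][obs[t]] += gamma[t][i]
    new_Z = [normalize(num_Z[i], Z[i]) for i in range(n)]

    return list(gamma[0]), new_T, new_Z, sum(math.log(x) for x in c)


def run_em(pi, T, Z, actions, obs, iterations=10):
    """Repeat em_step and record the log-likelihood before each update."""
    history = []
    for _ in range(iterations):
        pi, T, Z, ll = em_step(pi, T, Z, actions, obs)
        history.append(ll)
    return pi, T, Z, history


# Toy data (hypothetical): 2 states, 2 actions, 2 observation symbols.
obs = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
actions = [0, 1, 0, 0, 1, 1, 0, 1, 0]  # one fewer than the observations
pi0 = [0.6, 0.4]
T0 = [[[0.7, 0.3], [0.4, 0.6]], [[0.2, 0.8], [0.5, 0.5]]]
Z0 = [[0.8, 0.2], [0.3, 0.7]]

pi, T, Z, history = run_em(pi0, T0, Z0, actions, obs)
```

In this sketch `history` holds the log-likelihood of the data under the parameters at the start of each iteration, so it should never decrease from one iteration to the next. A single short sequence is used only to keep the example small; with real history data, the expected counts would normally be accumulated over many sequences before normalizing.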
Keywords :
Markov processes; expectation-maximisation algorithm; matrix algebra; probability; EM algorithm; POMDP; action sequence; observation probability function; observation sequence; partially observable Markov decision model; state distribution; state sequence; stochastic transition matrix; Data models; Equations; Hidden Markov models; History; Mathematical model; Prediction algorithms; HMM; POMDP Model
Conference_Title :
Application of Information and Communication Technologies (AICT), 2013 7th International Conference on
Conference_Location :
Baku
Print_ISBN :
978-1-4673-6419-5
DOI :
10.1109/ICAICT.2013.6722740