DocumentCode
3053282
Title
Learning partially observable Markov decision model with EM algorithm
Author
Hui Tan ; Shaohui Ma
Author_Institution
Sch. of Econ. & Manage., Jiangsu Univ. of Sci. & Technol., Zhenjiang, China
fYear
2013
fDate
23-25 Oct. 2013
Firstpage
1
Lastpage
4
Abstract
Most existing research focuses on POMDP modeling or solution methods. In some fields, however, a POMDP model must first be learned from historical data before an optimal policy can be obtained from it. Assuming that the historical data consist of an observation sequence and an action sequence, and that the state sequence is unobservable, we derive the formulas needed to estimate the parameters of a POMDP model with the EM algorithm, including the initial state distribution, the stochastic transition matrix, and the observation probability function.
Keywords
Markov processes; expectation-maximisation algorithm; matrix algebra; probability; EM algorithm; POMDP; action sequence; observation probability function; observation sequence; partially observable Markov decision model; state distribution; state sequence; stochastic transition matrix; Data models; Equations; Hidden Markov models; History; Markov processes; Mathematical model; Prediction algorithms; EM Algorithm; HMM; POMDP Model;
fLanguage
English
Publisher
IEEE
Conference_Titel
Application of Information and Communication Technologies (AICT), 2013 7th International Conference on
Conference_Location
Baku
Print_ISBN
978-1-4673-6419-5
Type
conf
DOI
10.1109/ICAICT.2013.6722740
Filename
6722740