DocumentCode
3157953
Title
About Q-values of Monte Carlo method
Author
Uemura, Wataru
Author_Institution
Dept. of Electron. & Inf., Ryukoku Univ., Otsu
fYear
2008
fDate
20-22 Aug. 2008
Firstpage
2035
Lastpage
2038
Abstract
The profit sharing method is a reinforcement learning method. Profit sharing can work well on partially observable Markov decision processes (POMDPs) because it is a typical non-bootstrap method and its Q-values are usually handled accumulatively. Profit sharing, however, does not work well under probabilistic state transitions. In this paper, we propose a novel learning method that can work well under probabilistic state transitions. It is similar to the Monte Carlo method, so we discuss the Q-values of the proposed method. In an environment with deterministic state transitions, we show that conventional profit sharing and the proposed method achieve the same performance. We also show that the proposed method performs well compared with conventional profit sharing.
Keywords
Markov processes; Monte Carlo methods; learning (artificial intelligence); Monte Carlo method; Q-values; nonbootstrap method; partially observable Markov decision processes; probabilistic state transition; profit sharing method; Boltzmann distribution; Equations; Informatics; Learning systems; Robustness; State estimation; Reinforcement Learning
fLanguage
English
Publisher
IEEE
Conference_Titel
SICE Annual Conference, 2008
Conference_Location
Tokyo
Print_ISBN
978-4-907764-30-2
Electronic_ISBN
978-4-907764-29-6
Type
conf
DOI
10.1109/SICE.2008.4654996
Filename
4654996