Title :
About Q-values of Monte Carlo method
Author_Institution :
Dept. of Electron. & Inf., Ryukoku Univ., Otsu
Abstract :
The profit sharing method is a reinforcement learning method. Profit sharing can work well on partially observable Markov decision processes (POMDPs) because it is a typical non-bootstrap method and its Q-values are usually accumulated. Profit sharing, however, does not work well under probabilistic state transitions. In this paper, we propose a novel learning method that works well under probabilistic state transitions. Because it is similar to the Monte Carlo method, we discuss the Q-values of the proposed method. In environments with deterministic state transitions, we show that conventional profit sharing and the proposed method achieve the same performance, and we show that the proposed method outperforms conventional profit sharing under probabilistic state transitions.
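The contrast the abstract draws can be sketched in code. The following is a minimal illustration, not the paper's exact algorithm: both rules are episode-based (non-bootstrap), but profit sharing *accumulates* a decaying share of the episode's reward onto each visited state-action pair, while a Monte Carlo estimate *averages* observed returns. The geometric credit rate `GAMMA` and the toy episode are assumptions for illustration only.

```python
# Minimal sketch (assumed, not the paper's algorithm) contrasting an
# accumulative profit-sharing update with Monte Carlo return averaging.
from collections import defaultdict

GAMMA = 0.5  # geometric credit-assignment rate (assumed value)

def profit_sharing_update(q, episode, reward):
    """Accumulate a geometrically decaying share of the terminal
    reward onto every (state, action) pair, from the end backwards."""
    credit = reward
    for state, action in reversed(episode):
        q[(state, action)] += credit  # accumulative: grows every episode
        credit *= GAMMA

def monte_carlo_update(q, counts, episode, reward):
    """Estimate Q as the running average of observed returns
    (here simply the undiscounted terminal reward)."""
    for state, action in episode:
        counts[(state, action)] += 1
        n = counts[(state, action)]
        q[(state, action)] += (reward - q[(state, action)]) / n

# One toy episode ending with reward 1.0.
episode = [("s0", "a0"), ("s1", "a1"), ("s2", "a0")]

q_ps = defaultdict(float)
profit_sharing_update(q_ps, episode, reward=1.0)

q_mc, counts = defaultdict(float), defaultdict(int)
monte_carlo_update(q_mc, counts, episode, reward=1.0)

print(q_ps[("s2", "a0")], q_ps[("s0", "a0")])  # 1.0 0.25
print(q_mc[("s0", "a0")])                      # 1.0
```

Run repeatedly over many episodes, the profit-sharing values grow without bound (which is why they are typically used only for action ranking), whereas the Monte Carlo averages converge toward the expected return, which matters once state transitions are probabilistic.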
Keywords :
Markov processes; Monte Carlo methods; learning (artificial intelligence); Monte Carlo method; Q-values; non-bootstrap method; partially observable Markov decision processes; probabilistic state transition; profit sharing method; Boltzmann distribution; Equations; Informatics; Learning systems; Robustness; State estimation; Monte Carlo method; Profit Sharing method; Reinforcement Learning;
Conference_Titel :
SICE Annual Conference, 2008
Conference_Location :
Tokyo
Print_ISBN :
978-4-907764-30-2
Electronic_ISBN :
978-4-907764-29-6
DOI :
10.1109/SICE.2008.4654996