An action-selection strategy insensitive to parameter-settings in reinforcement learning

Author

Ono, Kenji; Iwata, Kazunori; Hayashi, Akira

Author_Institution

Grad. Sch. of Inf. Sci., Hiroshima City Univ., Hiroshima, Japan

fYear

2009

fDate

18-21 Aug. 2009

Firstpage

1012

Lastpage

1017

Abstract

Markov decision processes are one of the most popular frameworks for reinforcement learning. The entropy of the probability density functions of a Markov decision process is referred to as its stochastic complexity. The stochastic complexity is helpful for tuning the parameters of an action-selection strategy to alleviate the exploration-exploitation dilemma. In this paper, we use the stochastic complexity to improve an action-selection strategy so that it becomes insensitive to parameter-settings. The improved strategy yields better policies with respect to this dilemma in most parameter-settings.
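
The abstract does not name the action-selection strategy, but the keywords of this record list the softmax method. As a hedged illustration of why such a strategy can be sensitive to its parameter setting, the following is a minimal Python sketch of generic softmax (Boltzmann) action selection. The function names, the `temperature` parameter, and the example action values are assumptions made for illustration and are not taken from the paper. The `entropy` helper only measures the randomness of the resulting action distribution; it is not the stochastic complexity described in the abstract, and the sketch does not reproduce the authors' improvement.

```python
import math
import random


def softmax_probabilities(q_values, temperature):
    """Return softmax (Boltzmann) action probabilities for estimated action values.

    `temperature` is the tunable parameter (an assumed name): small values make
    the choice nearly greedy, large values make it nearly uniform.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtracting the maximum keeps exp() from overflowing and leaves
    # the normalized probabilities unchanged.
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_action(q_values, temperature, rng=random):
    """Sample an action index according to the softmax probabilities."""
    probs = softmax_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Hypothetical action-value estimates for a single state.
    q = [1.0, 2.0, 3.0]
    for t in (0.1, 1.0, 10.0):
        p = softmax_probabilities(q, t)
        print(f"temperature={t}: probs={p}, entropy={entropy(p)}")
```

Running the script with the three temperatures shows how strongly the balance between exploration and exploitation depends on this single parameter: the same action values give an almost deterministic choice at a low temperature and an almost uniform one at a high temperature.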

Keywords

Markov processes; entropy; learning (artificial intelligence); Markov decision processes; action-selection strategy; exploration-exploitation dilemma; parameter tuning; parameter-settings; probability density functions; reinforcement learning; stochastic complexity; Adaptive systems; Information theory; Learning; Probability density function; Stochastic processes; Stochastic systems; Markov Decision Process; Softmax Method

fLanguage

English

Publisher

IEEE

Conference_Titel

ICCAS-SICE, 2009

Conference_Location

Fukuoka

Print_ISBN

978-4-907764-34-0

Electronic_ISBN

978-4-907764-33-3

Type

conf

Filename

5334921