DocumentCode
401704
Title
Adjusting backup-length automatically in reinforcement learning
Author
Ohta, Masayuki ; Noda, Itsuki
Author_Institution
Cyber Assist Res. Center, Nat. Inst. of Adv. Ind. Sci. & Technol., Tokyo, Japan
Volume
3
fYear
2003
fDate
2-5 Nov. 2003
Firstpage
1624
Abstract
Reinforcement learning agents often acquire wrong action-values in some states when the environment has problems such as perceptual aliasing. This is an especially serious problem for reinforcement learning methods that use bootstrapping, because they propagate the wrong action-values to other states. To address this problem, we propose DBLA (dynamic backup-length adjustment), in which the agent skips aliased states and performs the backup from the first non-aliased state. We demonstrate the effectiveness of DBLA on a grid-world maze example. The results show that this method greatly reduces the influence of the wrong action-values.
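Note: the following is a minimal illustrative sketch of the idea described in the abstract, not the paper's actual algorithm. It assumes a Q-learning-style update, an `is_aliased` oracle, a fixed action set, and the discount/learning-rate values shown; all of these names and parameters are assumptions for illustration only. The point it shows is the backup skipping aliased successors so that bootstrapping only uses the value of the first non-aliased state reached.

    from collections import defaultdict

    GAMMA = 0.9   # discount factor (assumed)
    ALPHA = 0.1   # learning rate (assumed)
    ACTIONS = ["up", "down", "left", "right"]  # grid-world actions (assumed)

    Q = defaultdict(float)  # Q[(state, action)] -> action-value

    def backup_skipping_aliased(trajectory, is_aliased):
        """trajectory: list of (state, action, reward) tuples whose last
        entry is a non-aliased state; is_aliased(state) -> bool is an
        assumed oracle marking perceptually aliased states."""
        for i, (s, a, r) in enumerate(trajectory[:-1]):
            ret, discount, j = r, GAMMA, i + 1
            # Instead of bootstrapping from an aliased successor (whose
            # action-values may be wrong), accumulate its reward and move on.
            while j < len(trajectory) - 1 and is_aliased(trajectory[j][0]):
                ret += discount * trajectory[j][2]
                discount *= GAMMA
                j += 1
            s_boot = trajectory[j][0]  # first non-aliased state on the path
            boot_val = max(Q[(s_boot, b)] for b in ACTIONS)
            Q[(s, a)] += ALPHA * (ret + discount * boot_val - Q[(s, a)])

In this sketch the backup length grows dynamically with the number of consecutive aliased states, which is the behavior the abstract attributes to DBLA; the exact update rule used in the paper may differ.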
Keywords
Markov processes; learning (artificial intelligence); software agents; bootstrapping; dynamic backup-length adjustment; grid-world maze; learning agents; nonaliased state; partially observable Markov decision processes; perceptual aliasing; reinforcement learning; Boltzmann distribution; Cybernetics; Degradation; Machine learning
fLanguage
English
Publisher
ieee
Conference_Titel
Machine Learning and Cybernetics, 2003 International Conference on
Print_ISBN
0-7803-8131-9
Type
conf
DOI
10.1109/ICMLC.2003.1259756
Filename
1259756