Title :
Adjusting backup-length automatically in reinforcement learning
Author :
Ohta, Masayuki ; Noda, Itsuki
Author_Institution :
Cyber Assist Res. Center, Nat. Inst. of Adv. Ind. Sci. & Technol., Tokyo, Japan
Abstract :
Reinforcement learning agents often acquire wrong action-values in some states when the environment has problems such as perceptual aliasing. This is especially serious for reinforcement learning methods that use bootstrapping, because bootstrapping propagates the wrong action-values to other states. To solve this problem, we propose dynamic backup-length adjustment (DBLA), in which the agent skips aliased states and performs the backup from the first non-aliased state. We demonstrate the effectiveness of DBLA on a grid-world maze example. The results show that this method greatly reduces the influence of the wrong action-values.
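The abstract gives only the core idea, so the following is a minimal sketch of one possible reading, not the authors' implementation. It assumes a Q-learning-style target, a recorded episode, and a caller-supplied predicate `is_aliased` (how aliased states are detected is not stated in the abstract). The names `backup_target`, `states`, `rewards`, and `actions` are hypothetical.

```python
from typing import Callable, Dict, Hashable, List, Tuple

State = Hashable
Action = Hashable


def backup_target(
    q: Dict[Tuple[State, Action], float],
    states: List[State],      # visited states, length T + 1 (last one is terminal)
    rewards: List[float],     # rewards[k] is received on leaving states[k], length T
    actions: List[Action],    # actions available in every state (assumption)
    is_aliased: Callable[[State], bool],  # assumed to be given
    t: int,
    gamma: float = 0.9,
) -> Tuple[float, int]:
    """Return (target, backup_length) for the state visited at step t.

    Rewards are accumulated while the successor states are aliased; the
    bootstrap value is taken from the first non-aliased successor.
    """
    ret = 0.0
    discount = 1.0
    k = t
    while k < len(rewards):
        ret += discount * rewards[k]
        discount *= gamma
        k += 1
        if not is_aliased(states[k]):
            break  # first non-aliased successor found
    if k < len(rewards):
        # states[k] is non-terminal: bootstrap from its greedy action-value.
        ret += discount * max(q.get((states[k], a), 0.0) for a in actions)
    # Otherwise the episode ended, and the terminal value is taken as 0.
    return ret, k - t


# Usage: two aliased cells "x" lie between "s0" and "s3".
q_table = {("s3", "go"): 0.5, ("x", "go"): 99.0}  # the value at "x" is unreliable
path = ["s0", "x", "x", "s3", "goal"]
target, length = backup_target(
    q_table, path, [0.0, 0.0, 0.0, 1.0], ["go"], lambda s: s == "x", t=0, gamma=0.5
)
```

In this sketch the unreliable value stored for the aliased cells never enters the target for `s0`, because the backup length stretches to three steps and the bootstrap value comes from `s3`.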
Keywords :
Markov processes; learning (artificial intelligence); software agents; bootstrapping; dynamic backup-length adjustment; grid-world maze; learning agents; nonaliased state; partially observable Markov decision processes; perceptual aliasing; reinforcement learning; Boltzmann distribution; Cybernetics; Degradation; Machine learning;
Conference_Title :
Machine Learning and Cybernetics, 2003 International Conference on
Print_ISBN :
0-7803-8131-9
DOI :
10.1109/ICMLC.2003.1259756