• DocumentCode
    401704
  • Title

    Adjusting backup-length automatically in reinforcement learning

  • Author

    Ohta, Masayuki ; Noda, Itsuki

  • Author_Institution
    Cyber Assist Res. Center, Nat. Inst. of Adv. Ind. Sci. & Technol., Tokyo, Japan
  • Volume
    3
  • fYear
    2003
  • fDate
    2-5 Nov. 2003
  • Firstpage
    1624
  • Abstract
    Reinforcement learning agents often acquire wrong action-values in some states when the environment has problems such as perceptual aliasing. This is especially serious for reinforcement learning methods that use bootstrapping, because bootstrapping propagates the wrong action-values to other states. To solve this problem, we propose dynamic backup-length adjustment (DBLA), in which the agent skips aliased states and performs the backup from the first non-aliased state. We demonstrate the effectiveness of DBLA on a grid-world maze example. The results show that this method greatly reduces the influence of the wrong action-values.
  • Keywords
    Markov processes; learning (artificial intelligence); software agents; bootstrapping; dynamic backup-length adjustment; grid-world maze; learning agents; nonaliased state; partially observable Markov decision processes; perceptual aliasing; reinforcement learning; Boltzmann distribution; Cybernetics; Degradation; Machine learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2003 International Conference on
  • Print_ISBN
    0-7803-8131-9
  • Type

    conf

  • DOI
    10.1109/ICMLC.2003.1259756
  • Filename
    1259756