• DocumentCode
    2858321
  • Title

    Reinforcement learning without an explicit terminal state

  • Author

    Riedmiller, Martin

  • Author_Institution
    Inst. für Logik, Komplexität und Deduktionssyst., Karlsruhe Univ., Germany
  • Volume
    3
  • fYear
    1998
  • fDate
    4-9 May 1998
  • Firstpage
    1998
  • Abstract
    Introduces a reinforcement learning framework based on dynamic programming for a class of control problems in which no explicit terminal state exists. This situation arises especially in technical process control: the control task does not end once a predefined target value is reached; instead, the controller must keep controlling the system to prevent its output from drifting away from the target value again. We propose a set of assumptions and give a proof of convergence for the value iteration method. From this, a new algorithm, which we call the fixed horizon algorithm, is derived. The performance of the proposed algorithm is compared to an approach that assumes the existence of an explicit terminal state. Finally, an application to a cart/double-pole system demonstrates the approach on a difficult practical control task.
  • Keywords
    convergence; dynamic programming; iterative methods; learning (artificial intelligence); neurocontrollers; position control; process control; self-adjusting systems; cart/double pole-system; dynamic programming; fixed horizon algorithm; reinforcement learning; technical process control; value iteration method; Chemical reactors; Control systems; Convergence; Cost function; Dynamic programming; Electronic mail; Learning; Optimal control; Process control; Temperature control;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks Proceedings, 1998. IEEE World Congress on Computational Intelligence. The 1998 IEEE International Joint Conference on
  • Conference_Location
    Anchorage, AK
  • ISSN
    1098-7576
  • Print_ISBN
    0-7803-4859-1
  • Type

    conf

  • DOI
    10.1109/IJCNN.1998.687166
  • Filename
    687166