• DocumentCode
    587314
  • Title
    A modification of gradient policy in reinforcement learning procedure
  • Author
    Abas, Marcel ; Skripcak, Tomas
  • Author_Institution
    Inst. of Appl. Inf., Autom. & Math., Slovak Univ. of Technol. in Bratislava, Trnava, Slovakia
  • fYear
    2012
  • fDate
    26-28 Sept. 2012
  • Firstpage
    1
  • Lastpage
    2
  • Abstract
    The gradient of a scalar function is frequently used in various areas of mathematics. In informatics it can be used, for example, in the learning procedure of many control systems. The key observation is that the gradient, if it is a nonzero vector, points in the direction of the greatest rate of increase of the scalar function. In this contribution we show a method for determining the direction(s) of greatest rate even when the gradient is the zero vector. We show that this can be done with the knowledge that students have at their stage of study.
  • Keywords
    gradient methods; intelligent robots; learning (artificial intelligence); vectors; control system; gradient policy modification; informatics; mathematics; nonzero vector gradient; reinforcement learning procedure; robot learning; scalar function gradient; zero vector gradient; Automation; Control systems; Educational institutions; Informatics; Learning; Robots; Vectors; control system; direction of greatest rate; gradient policy; neuron networks;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Interactive Collaborative Learning (ICL), 2012 15th International Conference on
  • Conference_Location
    Villach
  • Print_ISBN
    978-1-4673-2425-0
  • Electronic_ISBN
    978-1-4673-2426-7
  • Type
    conf
  • DOI
    10.1109/ICL.2012.6402200
  • Filename
    6402200