A modification of gradient policy in reinforcement learning procedure

Author

Abas, Marcel ; Skripcak, Tomas

Author_Institution

Inst. of Appl. Inf., Autom. & Math., Slovak Univ. of Technol. in Bratislava, Trnava, Slovakia

fYear

2012

fDate

26-28 Sept. 2012

Firstpage

1

Lastpage

2

Abstract

The gradient of a scalar function is frequently used in various areas of mathematics. In informatics it can be used, for example, in the learning procedure of many control systems. The key observation is that the gradient, if it is a non-zero vector, points in the direction of the greatest rate of increase of the scalar function. In this contribution we show a method for determining such direction(s) even if the gradient is the zero vector. We show that this can be done with the knowledge that students have at their stage of study.
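
The idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' method, which the abstract does not spell out. It assumes a smooth function of two variables, estimates the gradient by central differences, and, when the gradient is numerically zero, falls back on the eigenvector of the largest Hessian eigenvalue as one common second-order choice. The names gradient, hessian, and ascent_directions, and the step sizes and tolerance, are hypothetical.

```python
import math


def gradient(f, x, y, h=1e-5):
    """Central-difference estimate of the gradient of f at (x, y)."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy


def hessian(f, x, y, h=1e-4):
    """Central-difference estimate of the Hessian entries (fxx, fxy, fyy)."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return fxx, fxy, fyy


def ascent_directions(f, x, y, tol=1e-6):
    """Return unit direction(s) of greatest increase of f at (x, y).

    Non-zero gradient: the normalized gradient.
    Zero gradient (assumed fallback, not taken from the paper): the
    eigenvector of the largest Hessian eigenvalue, in both signs, provided
    that eigenvalue is positive. An empty list means no ascent direction
    was found at second order.
    """
    gx, gy = gradient(f, x, y)
    norm = math.hypot(gx, gy)
    if norm > tol:
        return [(gx / norm, gy / norm)]

    a, b, c = hessian(f, x, y)
    # Largest eigenvalue of the symmetric matrix [[a, b], [b, c]].
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b**2)
    if lam <= tol:
        return []

    # Two algebraically equivalent eigenvector formulas; keep the one that
    # is numerically better conditioned (larger norm).
    candidates = [(b, lam - a), (lam - c, b)]
    vx, vy = max(candidates, key=lambda v: math.hypot(v[0], v[1]))
    n = math.hypot(vx, vy)
    if n < 1e-12:
        # Hessian is a multiple of the identity: every direction is
        # equivalent, so pick the x-axis by convention.
        vx, vy, n = 1.0, 0.0, 1.0
    return [(vx / n, vy / n), (-vx / n, -vy / n)]


if __name__ == "__main__":
    # Saddle point of x^2 - y^2 at the origin: the gradient is zero there,
    # yet the function increases fastest along the x-axis.
    print(ascent_directions(lambda x, y: x**2 - y**2, 0.0, 0.0))
```

At a zero gradient the second-order rate of change is the same along a direction and its opposite, which is why the sketch returns a pair of directions in that case.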

Keywords

gradient methods; intelligent robots; learning (artificial intelligence); vectors; control system; gradient policy modification; informatics; mathematics; nonzero vector gradient; reinforcement learning procedure; robot learning; scalar function gradient; zero vector gradient; Automation; Control systems; Educational institutions; Informatics; Learning; Robots; Vectors; direction of greatest rate; gradient policy; neuron networks

fLanguage

English

Publisher

IEEE

Conference_Titel

Interactive Collaborative Learning (ICL), 2012 15th International Conference on

Conference_Location

Villach

Print_ISBN

978-1-4673-2425-0

Electronic_ISBN

978-1-4673-2426-7

Type

conf

DOI

10.1109/ICL.2012.6402200

Filename

6402200