Title :
Safe inclusion of information about rates of variation in a reinforcement learning algorithm
Author :
Ribeiro, Carlos H C
Author_Institution :
Div. de Engenharia Eletrônica, Inst. Tecnológico de Aeronáutica, São Paulo, Brazil
Abstract :
There is a need to enhance reinforcement learning techniques with prior knowledge built into the agent at its inception. The crudeness of the information on which these algorithms operate may be interesting from a theoretical point of view, but treating the learning agent as a 'tabula rasa' makes large-scale problems too difficult and unrealistic. Nonetheless, knowledge must be embedded in such a way that the well-studied structural characteristics of the fundamental algorithms are preserved. We investigate a more general formulation of a classical reinforcement learning method that spreads the information derived from each single update towards a neighbourhood of the currently visited state and that converges to optimality. We show how this formulation can be used as a mechanism to safely embed prior knowledge about expected rates of variation of action values, and practical studies demonstrate an application of the proposed algorithm.
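The abstract does not give the exact update rule, so the following is only a minimal illustrative sketch of the idea it describes: a Q-learning step whose temporal-difference information is also propagated to states near the visited one, weighted by a similarity (spreading) function. The chain-world setup, the Gaussian kernel, and the names `spreading_weight`, `spread_update`, and `sigma` are assumptions made for illustration, not the paper's formulation.

```python
import numpy as np

# Hypothetical toy setup: a small chain of discrete states with two actions.
n_states, n_actions = 10, 2
Q = np.zeros((n_states, n_actions))

alpha, gamma = 0.1, 0.95   # learning rate and discount factor
sigma = 1.0                # width of the assumed spreading kernel


def spreading_weight(s, x):
    """Similarity between the visited state s and a neighbouring state x.
    A Gaussian kernel on the state index is an illustrative choice only."""
    return np.exp(-((s - x) ** 2) / (2.0 * sigma ** 2))


def spread_update(s, a, r, s_next):
    """One Q-learning step whose target is also applied, with decayed
    weight, to the action values of states in the neighbourhood of s."""
    td_target = r + gamma * np.max(Q[s_next])
    for x in range(n_states):
        w = spreading_weight(s, x)          # w = 1 at the visited state itself
        Q[x, a] += alpha * w * (td_target - Q[x, a])
```

With `spreading_weight` collapsing to an indicator of the visited state (e.g. very small `sigma`), the sketch reduces to standard one-step Q-learning; a broader kernel is one plausible way to encode prior knowledge that action values vary slowly across neighbouring states.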
Keywords :
approximation theory; convergence; intelligent control; knowledge representation; learning (artificial intelligence); state estimation; Q learning; learning agent; learning control; reinforcement learning; safe inclusion; Cost function; Dynamic programming; Electronic switching systems; Learning systems; Read only memory; Stochastic processes; Table lookup;
Conference_Titel :
Proceedings of the Vth Brazilian Symposium on Neural Networks, 1998
Conference_Location :
Belo Horizonte
Print_ISBN :
0-8186-8629-4
DOI :
10.1109/SBRN.1998.730985