DocumentCode :
396653
Title :
Competitive reinforcement learning in continuous control tasks
Author :
Abramson, Myriam ; Pachowicz, Peter ; Wechsler, Harry
Author_Institution :
George Mason Univ., Fairfax, VA, USA
Volume :
3
fYear :
2003
fDate :
20-24 July 2003
Firstpage :
1909
Abstract :
This paper describes a novel hybrid reinforcement learning algorithm, Sarsa Learning Vector Quantization (SLVQ), that leaves the reinforcement part intact but employs a more effective representation of the policy function: a piecewise constant function based upon "policy prototypes". The prototypes correspond to the pattern classes induced by the Voronoi tessellation generated by self-organizing methods like Learning Vector Quantization (LVQ). The determination of the optimal policy function can now be viewed as a pattern recognition problem, in the sense that the assignment of an action to a point in the phase space is similar to the assignment of a pattern class to a point in phase space. The distributed LVQ representation of the policy function automatically generates a piecewise constant tessellation of the state space and yields a major simplification of the learning task relative to the standard reinforcement learning algorithms, for which a discontinuous table look-up function has to be learned. The feasibility and comparative advantages of the new algorithm are shown on the cart centering and mountain car problems, two control problems of increasing difficulty.
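The piecewise constant policy described in the abstract can be illustrated with a nearest-prototype lookup: each prototype carries an action label, and a state receives the action of its closest prototype, so every Voronoi cell maps to a single action. The sketch below is not the authors' implementation. The function name, the prototype coordinates, the action labels, and the choice of Euclidean distance are all assumptions made for illustration, and the Sarsa-driven prototype updates are not shown because the abstract does not specify them.

```python
import math


def nearest_prototype_action(state, prototypes):
    """Return the action label of the prototype closest to `state`.

    `prototypes` is a list of (position, action) pairs. Euclidean distance
    is an assumption here; the abstract does not name a metric.
    """
    best_action, best_dist = None, math.inf
    for position, action in prototypes:
        dist = math.dist(state, position)
        if dist < best_dist:
            best_action, best_dist = action, dist
    return best_action


# Hypothetical prototypes over a 2-D (position, velocity) state space.
# The coordinates and action labels are illustrative only.
PROTOTYPES = [
    ((-1.0, 0.0), +1),  # left of the target: push right
    ((1.0, 0.0), -1),   # right of the target: push left
]

print(nearest_prototype_action((-0.4, 0.1), PROTOTYPES))
```

Under these assumptions the policy is constant inside each Voronoi cell, so learning only has to place and label a small set of prototypes rather than fill in a full look-up table.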
Keywords :
learning (artificial intelligence); self-organising feature maps; vector quantisation; Sarsa learning vector quantization; Voronoi tessellation; cart centering problem; continuous control task; distributed LVQ; hybrid reinforcement learning algorithm; mountain car problem; optimal policy function; pattern recognition problem; piecewise constant function; piecewise constant tessellation; policy prototypes; self-organizing methods; Computer science; Design engineering; Fires; Hybrid power systems; Learning; Pattern recognition; Prototypes; State-space methods; Temperature control; Vector quantization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Neural Networks, 2003. Proceedings of the International Joint Conference on
ISSN :
1098-7576
Print_ISBN :
0-7803-7898-9
Type :
conf
DOI :
10.1109/IJCNN.2003.1223699
Filename :
1223699