DocumentCode
396653
Title
Competitive reinforcement learning in continuous control tasks
Author
Abramson, Myriam ; Pachowicz, Peter ; Wechsler, Harry
Author_Institution
George Mason Univ., Fairfax, VA, USA
Volume
3
fYear
2003
fDate
20-24 July 2003
Firstpage
1909
Abstract
This paper describes a novel hybrid reinforcement learning algorithm, Sarsa Learning Vector Quantization (SLVQ), that leaves the reinforcement part intact but employs a more effective representation of the policy function: a piecewise constant function based upon "policy prototypes". The prototypes correspond to the pattern classes induced by the Voronoi tessellation generated by self-organizing methods such as Learning Vector Quantization (LVQ). The determination of the optimal policy function can now be viewed as a pattern recognition problem, in the sense that assigning an action to a point in the phase space is similar to assigning a pattern class to a point in the phase space. The distributed LVQ representation of the policy function automatically generates a piecewise constant tessellation of the state space and yields a major simplification of the learning task relative to standard reinforcement learning algorithms, for which a discontinuous table look-up function has to be learned. The feasibility and comparative advantages of the new algorithm are shown on the cart centering and mountain car problems, two control problems of increasing difficulty.
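The piecewise constant policy described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes each policy prototype is a point in the phase space paired with an action label, and that a state takes the action of its nearest prototype under Euclidean distance, which is what produces the Voronoi cells. The function name, the prototype coordinates, and the action names are all hypothetical, and the Sarsa and LVQ updates that would learn the prototypes are not shown.

```python
import math


def nearest_prototype_action(state, prototypes):
    """Return the action label of the prototype closest to `state`.

    `prototypes` is a list of (position, action) pairs. Every state inside
    the same Voronoi cell gets the same action, so the resulting policy is
    piecewise constant over the state space.
    """
    best_action, best_dist = None, math.inf
    for position, action in prototypes:
        dist = math.dist(state, position)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_action, best_dist = action, dist
    return best_action


# Hypothetical prototypes in a 2-D (position, velocity) phase space.
PROTOTYPES = [
    ((-1.0, 0.0), "push_right"),
    ((1.0, 0.0), "push_left"),
]

action = nearest_prototype_action((-0.5, 0.1), PROTOTYPES)
```

With these illustrative prototypes, any state left of the vertical line through the origin maps to one action and any state to the right maps to the other, so the policy is described by two prototypes rather than by a table over a discretized grid.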
Keywords
learning (artificial intelligence); self-organising feature maps; vector quantisation; Sarsa learning vector quantization; Voronoi tessellation; cart centering problem; continuous control task; distributed LVQ; hybrid reinforcement learning algorithm; mountain car problem; optimal policy function; pattern recognition problem; piecewise constant function; piecewise constant tessellation; policy prototypes; self-organizing methods; Computer science; Design engineering; Fires; Hybrid power systems; Learning; Pattern recognition; Prototypes; State-space methods; Temperature control; Vector quantization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Neural Networks, 2003. Proceedings of the International Joint Conference on
ISSN
1098-7576
Print_ISBN
0-7803-7898-9
Type
conf
DOI
10.1109/IJCNN.2003.1223699
Filename
1223699