DocumentCode
2498428
Title
Reinforcement learning algorithms for solving classification problems
Author
Wiering, Marco A. ; Van Hasselt, Hado ; Pietersma, Auke-Dirk ; Schomaker, Lambert
Author_Institution
Dept. of Artificial Intell., Univ. of Groningen, Groningen, Netherlands
fYear
2011
fDate
11-15 April 2011
Firstpage
91
Lastpage
96
Abstract
We describe a new framework for applying reinforcement learning (RL) algorithms to classification tasks, in which an agent acts on the inputs and learns value functions. We show how classification problems can be modeled using classification Markov decision processes and introduce the Max-Min ACLA algorithm, an extension of the novel RL algorithm called actor-critic learning automaton (ACLA). Experiments are performed on 8 datasets from the UCI repository, with our RL method combined with multi-layer perceptrons that serve as function approximators. The RL method is compared to conventional multi-layer perceptrons and support vector machines; the results show that our method slightly outperforms the multi-layer perceptron and performs as well as the support vector machine. Finally, we describe many possible extensions to the basic method, leaving considerable room for future research to improve it.
Keywords
Markov processes; function approximation; learning (artificial intelligence); minimax techniques; multi-agent systems; multilayer perceptrons; pattern classification; support vector machines; RL algorithm; actor-critic learning automaton; classification Markov decision process; classification problem; function approximator; max-min ACLA algorithm; multilayer perceptrons; reinforcement learning; support vector machine; Accuracy; Artificial neural networks; Learning; Markov processes; Support vector machines; Testing; Training;
fLanguage
English
Publisher
IEEE
Conference_Titel
Adaptive Dynamic Programming And Reinforcement Learning (ADPRL), 2011 IEEE Symposium on
Conference_Location
Paris
Print_ISBN
978-1-4244-9887-1
Type
conf
DOI
10.1109/ADPRL.2011.5967372
Filename
5967372