Reinforcement learning algorithms for solving classification problems

Author

Wiering, Marco A. ; Van Hasselt, Hado ; Pietersma, Auke-Dirk ; Schomaker, Lambert

Author_Institution

Dept. of Artificial Intell., Univ. of Groningen, Groningen, Netherlands

fYear

2011

fDate

11-15 April 2011

Firstpage

91

Lastpage

96

Abstract

We describe a new framework for applying reinforcement learning (RL) algorithms to classification tasks, in which an agent acts on the inputs and learns value functions. We show how classification problems can be modeled using classification Markov decision processes and introduce the Max-Min ACLA algorithm, an extension of the novel RL algorithm called actor-critic learning automaton (ACLA). Experiments are performed on 8 datasets from the UCI repository, with our RL method combined with multi-layer perceptrons that serve as function approximators. We compare the RL method to conventional multi-layer perceptrons and support vector machines; the results show that it slightly outperforms the multi-layer perceptron and performs as well as the support vector machine. Finally, we describe many possible extensions to the basic method, which leave considerable room for future research to improve it.

Keywords

Markov processes; function approximation; learning (artificial intelligence); minimax techniques; multi-agent systems; multilayer perceptrons; pattern classification; support vector machines; RL algorithm; actor-critic learning automaton; classification Markov decision process; classification problem; function approximator; max-min ACLA algorithm; reinforcement learning; support vector machine; Accuracy; Artificial neural networks; Learning; Testing; Training

fLanguage

English

Publisher

IEEE

Conference_Titel

Adaptive Dynamic Programming And Reinforcement Learning (ADPRL), 2011 IEEE Symposium on

Conference_Location

Paris

Print_ISBN

978-1-4244-9887-1

Type

conf

DOI

10.1109/ADPRL.2011.5967372

Filename

5967372