Title of article

BUILDING A MODEL TO PREDICT CLASSIFIER ACCURACY

Author/Authors

Aubakirov, S.S. al-Farabi Kazakh National University, Almaty, Kazakhstan; Trigo, P. Instituto Superior de Engenharia de Lisboa Biosystems and Integrative Sciences - Institute Agent and Systems Modeling, Lisbon, Portugal; Ahmed-Zaki, D. Zh. al-Farabi Kazakh National University, Almaty, Kazakhstan

Pages

From page

To page

Abstract

In this paper, we propose an optimization workflow to predict classifier accuracy by exploring the space composed of different data features and the configurations of the classification algorithms. The overall process is described in the context of the text classification problem. We consider three main features that affect text classification and therefore the accuracy of classifiers. The first feature concerns the words that comprise the input text; here we use the N-gram concept with different values of N. The second feature concerns the adoption of textual pre-processing steps such as stop-word filtering and stemming. The third feature concerns the hyperparameters of the classification algorithms. We take the well-known classifiers K-Nearest Neighbors (KNN) and Naive Bayes (NB), where K (for KNN) and the a priori probabilities (for NB) are hyperparameters that influence accuracy. As a result, we explore the feature space (the correlation among textual and classifier aspects) and present an approximation model that is able to predict classifier accuracy.
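
The space described in the abstract can be pictured as a Cartesian product of the N-gram size, the pre-processing switches, and a classifier hyperparameter. The sketch below is illustrative only and is not the authors' implementation: the function names (word_ngrams, knn_configurations) and every value range are assumptions, since the abstract does not state the values actually explored.

```python
from itertools import product


def word_ngrams(text, n):
    """Return the word n-grams of `text` as a list of tuples."""
    tokens = text.lower().split()
    # A text with fewer than n tokens yields an empty list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical value ranges, chosen only for illustration.
N_VALUES = (1, 2, 3)                # N-gram sizes
STOP_WORD_FILTERING = (False, True)  # pre-processing switch
STEMMING = (False, True)             # pre-processing switch
KNN_K_VALUES = (1, 3, 5)             # K hyperparameter of KNN


def knn_configurations():
    """Enumerate every point of the assumed configuration space for KNN."""
    return [
        {"n": n, "stop_words": sw, "stemming": st, "k": k}
        for n, sw, st, k in product(
            N_VALUES, STOP_WORD_FILTERING, STEMMING, KNN_K_VALUES
        )
    ]
```

Each configuration would be paired with a measured accuracy to form the training data for an approximation model; the Naive Bayes case would swap K for a set of prior probabilities.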

Keywords

text classification, learning algorithms, genetic algorithm, distributed computing

Journal title

Eurasian Journal of Mathematical and Computer Applications

Serial Year

2017

Full Text URL

drive.google.com/file/d/0B2YBOGP6HM8vTWlnaG9Hb09ZSjd0cTUxdnlqM05VeXYtZm5F/view

Record number

2601658

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=10&DC=2601658