DocumentCode :
1849598
Title :
Experimental analysis of new algorithms for learning ternary classifiers
Author :
Zucker, Jean-Daniel ; Chevaleyre, Yann ; Van Sang, Dao
Author_Institution :
IRD France Nord, UMMISCO, Bondy, France
fYear :
2015
fDate :
25-28 Jan. 2015
Firstpage :
19
Lastpage :
24
Abstract :
Discrete linear classifiers form a very sparse class of decision models that has proved useful for reducing overfitting in very high-dimensional learning problems. However, learning a discrete linear classifier is known to be a difficult problem: it requires finding a discrete linear model that minimizes the classification error over a given sample. A ternary classifier is a classifier defined by a pair (w, r), where w is a vector in {-1, 0, +1}^n and r is a nonnegative real number capturing the threshold or offset. The goal of the learning algorithm is to find a vector of weights in {-1, 0, +1}^n that minimizes the hinge loss of the linear model on the training data. This problem is NP-hard, and one approach consists in solving the relaxed continuous problem exactly and then heuristically deriving discrete solutions. A recent paper by the authors introduced a randomized rounding algorithm [1]; in this paper we propose more sophisticated algorithms that improve the generalization error. These algorithms are presented and their performance is analyzed experimentally. Our results show that this kind of compact model can address the complex problem of learning predictors from bioinformatics data, such as metagenomics data, where the number of samples is much smaller than the number of attributes. The new algorithms improve on the state-of-the-art algorithm for learning ternary classifiers. This improvement comes at the expense of time complexity.
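As an illustration of the relax-then-round idea described in the abstract, the following is a minimal Python sketch, not the authors' implementation. It makes several assumptions that the abstract does not state: the decision rule predicts +1 when the score <w, x> reaches the threshold r, the hinge loss is written as max(0, 1 - y(<w, x> - r)), and the rounding scheme (keep the sign of each relaxed weight with probability equal to its magnitude, otherwise set it to 0) is a generic randomized rounding rule rather than the specific algorithm of [1]. The function names `predict`, `hinge_loss`, `randomized_round`, and `best_of_rounds` are hypothetical.

```python
import random


def predict(w, r, x):
    """Assumed decision rule: +1 if <w, x> >= r, otherwise -1."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if score >= r else -1


def hinge_loss(w, r, X, y):
    """Mean hinge loss max(0, 1 - y * (<w, x> - r)) over the sample (assumed form)."""
    total = 0.0
    for xi, yi in zip(X, y):
        margin = yi * (sum(a * b for a, b in zip(w, xi)) - r)
        total += max(0.0, 1.0 - margin)
    return total / len(X)


def randomized_round(w_relaxed, rng):
    """Round weights in [-1, 1] to {-1, 0, +1}.

    Each weight becomes sign(w) with probability |w| and 0 otherwise.
    This is a generic scheme, assumed here for illustration only.
    """
    rounded = []
    for wi in w_relaxed:
        if rng.random() < abs(wi):
            rounded.append(1 if wi > 0 else -1)
        else:
            rounded.append(0)
    return rounded


def best_of_rounds(w_relaxed, r, X, y, n_rounds=50, seed=0):
    """Draw several rounded candidates and keep the one with the lowest hinge loss."""
    rng = random.Random(seed)
    candidates = [randomized_round(w_relaxed, rng) for _ in range(n_rounds)]
    return min(candidates, key=lambda w: hinge_loss(w, r, X, y))


if __name__ == "__main__":
    # Toy data: the relaxed weights stand in for the output of a continuous solver.
    X = [[2.0, 5.0, 1.0], [0.0, 5.0, 1.0]]
    y = [1, -1]
    w_relaxed = [0.9, 0.1, -0.8]
    w = best_of_rounds(w_relaxed, r=0.5, X=X, y=y)
    print(w, hinge_loss(w, 0.5, X, y))
```

Drawing many candidates and keeping the best one trades extra computation for a lower training loss, which is in the spirit of the abstract's remark that the improvement comes at the expense of time complexity.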
Keywords :
bioinformatics; computational complexity; generalisation (artificial intelligence); learning (artificial intelligence); pattern classification; vectors; NP-hard; bioinformatics data; classification error minimization; decision model; discrete linear classifier learning; generalization error; metagenomics; randomized rounding algorithm; ternary classifier learning; time complexity; vector; Algorithm design and analysis; Classification algorithms; Data models; Error analysis; Fasteners; Prediction algorithms; Vectors; Metagenomics data; Randomized Rounding; Ternary Classifier;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computing & Communication Technologies - Research, Innovation, and Vision for the Future (RIVF), 2015 IEEE RIVF International Conference on
Conference_Location :
Can Tho
Print_ISBN :
978-1-4799-8043-7
Type :
conf
DOI :
10.1109/RIVF.2015.7049868
Filename :
7049868