Title :
A comparison of class-balance strategies for SVM in the problem of protein function prediction
Author :
Luis Roberto Mercado-Díaz; Julián Navarro-García; Jorge Alberto Jaramillo-Garzón
Author_Institution :
Grupo de Automá
Abstract :
This paper compares three strategies for handling class imbalance when training SVMs for protein function prediction: undersampling, SMOTE, and Weighted SVM. Undersampling discards samples of the majority class; SMOTE (Synthetic Minority Over-sampling Technique) adds synthetic samples of the minority class to the dataset; Weighted SVM keeps the number of samples in each class unchanged but assigns class weights during SVM training to improve performance on the minority class. Results show that Weighted SVM and SMOTE achieved comparable performance, although Weighted SVM requires less computational effort. Undersampling, by contrast, achieved lower performance, presumably because of the information lost when data are discarded.
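The three strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the software, the number of neighbours, or the weighting scheme used, so the function names (undersample, smote, balanced_class_weights), the toy data, the choice of k = 3, and the inverse-frequency weighting rule are all assumptions made for illustration only.

```python
import random
from collections import Counter


def undersample(X, y, seed=0):
    """Randomly discard samples so every class matches the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    n_min = min(counts.values())
    keep = []
    for label in counts:
        idx = [i for i, lab in enumerate(y) if lab == label]
        keep.extend(rng.sample(idx, n_min))
    keep.sort()
    return [X[i] for i in keep], [y[i] for i in keep]


def smote(X_min, n_new, k=3, seed=0):
    """Create n_new synthetic minority points by interpolating between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(X_min)
        others = [p for p in X_min if p is not base]
        # Sort the remaining minority samples by squared Euclidean distance.
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def balanced_class_weights(y):
    """Assumed weighting rule: n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    n = len(y)
    return {label: n / (len(counts) * c) for label, c in counts.items()}


# Hypothetical toy data: 8 majority samples (label 0), 3 minority (label 1).
X = [[float(i), float(i % 3)] for i in range(8)]
X += [[10.0, 10.0], [11.0, 10.0], [10.0, 11.0]]
y = [0] * 8 + [1] * 3

X_under, y_under = undersample(X, y)   # majority class shrunk to 3 samples
X_synth = smote(X[8:], n_new=5)        # 5 new minority samples
weights = balanced_class_weights(y)    # dataset size left unchanged
```

In this sketch undersampling shrinks the training set, SMOTE enlarges it, and weighting leaves it untouched, which is in line with the abstract's remark that Weighted SVM requires less computational effort than SMOTE. In SVM libraries that support per-class weights, such weights typically scale the misclassification penalty C for each class; scikit-learn's SVC, for example, accepts them through its class_weight parameter.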
Keywords :
"Support vector machines","Sensitivity","Yttrium","Bioinformatics"
Conference_Title :
Signal Processing, Images and Computer Vision (STSIVA), 2015 20th Symposium on
DOI :
10.1109/STSIVA.2015.7330418