DocumentCode :
2312791
Title :
An Investigation on Linear SVM and its Variants for Text Categorization
Author :
Kumar, M. Arun ; Gopal, M.
Author_Institution :
Controls & Optimization Res., ABB Global Ind. & Services Ltd., Bangalore, India
fYear :
2010
fDate :
9-11 Feb. 2010
Firstpage :
27
Lastpage :
31
Abstract :
Linear Support Vector Machines (SVMs) have been used successfully to classify text documents into a set of concepts. With the increasing number of linear SVM formulations and decomposition algorithms publicly available, this paper studies their efficiency and efficacy for text categorization tasks. Eight publicly available implementations are investigated in terms of Break Even Point (BEP), F1 measure, ROC plots, learning speed, and sensitivity to the penalty parameter, based on experimental results on two benchmark text corpora. The results show that, of the eight implementations, SVMlin and Proximal SVM perform better in terms of consistent performance and reduced training time. However, being an extremely simple algorithm whose training time is independent of the penalty parameter and of the category being trained, Proximal SVM is appealing. We further investigated fuzzy proximal SVM on both text corpora; it showed improved generalization over proximal SVM.
Keywords :
support vector machines; text analysis; break even point; linear SVM; penalty parameter; text categorization; text documents; Computer industry; Fuzzy sets; Industrial control; Machine learning; Space technology; Support vector machine classification; Testing; Velocity measurement; Fuzzy Proximal SVM; Proximal SVM
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Computing (ICMLC), 2010 Second International Conference on
Conference_Location :
Bangalore
Print_ISBN :
978-1-4244-6006-9
Electronic_ISBN :
978-1-4244-6007-6
Type :
conf
DOI :
10.1109/ICMLC.2010.64
Filename :
5460696