Title :
Effect of training set size on SVM and Naive Bayes for Twitter sentiment analysis
Author :
Omar Abdelwahab;Mohamed Bahgat;Christopher J. Lowrance;Adel Elmaghraby
Author_Institution :
Computer Engineering and Computer Science, University of Louisville, Louisville, KY, USA
Abstract :
Twitter sentiment analysis has become an effective way of measuring public sentiment about a given topic or product. Consequently, researchers have worked extensively in recent years to build efficient models for sentiment classification. In this paper, we measure the effect of varying the training set size on the classification accuracy and F-score of SVM and Naive Bayes classifiers. We extend the study by forming two ensembles, Ensemble 1 and Ensemble 2. Each ensemble combines a single Naive Bayes classifier with a single SVM classifier; the two ensembles differ only in the decision fusion technique used. Ensemble 1 uses 'AND-type' fusion, while Ensemble 2 uses 'OR-type' fusion. For each ensemble configuration, we likewise measure the F-score and classification accuracy as the training set size varies.
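The abstract does not spell out the fusion rules, so the following is only a minimal sketch of one common reading: with binary labels (1 = positive, 0 = negative, an assumption made here), 'AND-type' fusion outputs positive only when both classifiers say positive, and 'OR-type' fusion outputs positive when either does. The function names, the label encoding, and the toy predictions are all hypothetical and are not taken from the paper.

```python
from typing import List

def and_fusion(nb: List[int], svm: List[int]) -> List[int]:
    # Assumed rule: positive only if BOTH classifiers predict positive.
    return [int(a == 1 and b == 1) for a, b in zip(nb, svm)]

def or_fusion(nb: List[int], svm: List[int]) -> List[int]:
    # Assumed rule: positive if EITHER classifier predicts positive.
    return [int(a == 1 or b == 1) for a, b in zip(nb, svm)]

def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    # Fraction of predictions that match the true labels.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f_score(y_true: List[int], y_pred: List[int]) -> float:
    # F1 score for the positive class (label 1).
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy, made-up labels and predictions (not data from the paper).
y_true = [1, 0, 1, 1, 0, 0]
nb_pred = [1, 1, 0, 1, 0, 0]
svm_pred = [1, 0, 1, 0, 0, 1]

ens1 = and_fusion(nb_pred, svm_pred)  # stand-in for Ensemble 1
ens2 = or_fusion(nb_pred, svm_pred)   # stand-in for Ensemble 2

print("AND:", ens1, accuracy(y_true, ens1), f_score(y_true, ens1))
print("OR: ", ens2, accuracy(y_true, ens2), f_score(y_true, ens2))
```

Under this reading, AND-type fusion tends to trade recall for precision on the positive class and OR-type fusion does the opposite, which is why accuracy and F-score can respond differently for the two ensembles.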
Keywords :
"Training","Support vector machines","Sentiment analysis","Size measurement","Companies","Training data","Libraries"
Conference_Titel :
Signal Processing and Information Technology (ISSPIT), 2015 IEEE International Symposium on
DOI :
10.1109/ISSPIT.2015.7394379