Title :
Investigating active learning techniques for document level sentiment classification of tweets
Author :
Kumar, Ayush ; Kansal, Chaitanya ; Ekbal, Asif
Author_Institution :
Dept. of Comput. Sci. & Eng., Indian Inst. of Technol. Patna, Patna, India
Abstract :
Active learning is a technique that automatically selects useful instances from unlabelled data so that, when they are added to the training data, overall classification performance improves. Creating training examples by hand otherwise involves significant cost and effort, which is a major constraint for supervised algorithms. In this paper, we investigate the effectiveness of active learning for sentiment classification of Tweets. The algorithm selects informative unlabelled data using uncertainty sampling, which adds to the training set only those Tweets that help the classifier refine its decision boundary quickly. Our experiments on a benchmark dataset of Tweets show an overall accuracy of 83.95%, an improvement of 6.75% over the baseline model, which is constructed by training a Support Vector Machine (SVM) with the full set of available features. The approach is very general, and is therefore scalable, domain-adaptable and easy to implement for a wide variety of problems.
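The uncertainty-sampling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binary classifier that returns a signed score (for an SVM, the distance from the decision boundary), and treats the instances whose score is closest to zero as the most uncertain. The names `select_most_uncertain`, `active_learning_loop`, `train` and `oracle` are hypothetical and introduced only for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Labelled = List[Tuple[Vector, int]]
Scorer = Callable[[Vector], float]


def select_most_uncertain(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k scores closest to zero.

    Assumption: a score near zero means the instance lies near the
    decision boundary, so the classifier is least certain about it.
    """
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i]))
    return ranked[:k]


def active_learning_loop(
    labelled: Labelled,
    pool: List[Vector],
    train: Callable[[Labelled], Scorer],
    oracle: Callable[[Vector], int],
    batch_size: int = 1,
    rounds: int = 5,
) -> Scorer:
    """Repeatedly label the most uncertain pool items and retrain.

    `train` (hypothetical) fits a model and returns its scoring function;
    `oracle` (hypothetical) stands in for the human annotator.
    """
    labelled = list(labelled)  # copy so the caller's data is not mutated
    pool = list(pool)
    score = train(labelled)
    for _ in range(rounds):
        if not pool:
            break
        picked = select_most_uncertain([score(x) for x in pool], batch_size)
        # Pop from the highest index down so earlier indices stay valid.
        for i in sorted(picked, reverse=True):
            x = pool.pop(i)
            labelled.append((x, oracle(x)))
        score = train(labelled)
    return score
```

In practice `train` could wrap any margin-based learner, such as an SVM, and `batch_size` controls how many Tweets are sent for annotation per round; both choices are assumptions of this sketch rather than details given in the abstract.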
Keywords :
emotion recognition; learning (artificial intelligence); pattern classification; social networking (online); support vector machines; text analysis; SVM; Tweet sentiment classification; active learning techniques; decision boundary; document level sentiment classification; informative unlabelled data; supervised algorithms; support vector machine; uncertainty sampling; Accuracy; Learning systems; Logic gates; Silicon;
Conference_Title :
Communication Systems and Networks (COMSNETS), 2015 7th International Conference on
Conference_Location :
Bangalore
DOI :
10.1109/COMSNETS.2015.7098727