Title :
Probabilistic model-based sentiment analysis of Twitter messages
Author :
Celikyilmaz, Asli ; Hakkani-Tür, Dilek ; Feng, Junlan
Author_Institution :
Univ. of California, Berkeley, CA, USA
Abstract :
We present a machine learning approach to sentiment classification of Twitter messages (tweets). We classify each tweet into one of two categories: polar or non-polar. Tweets with positive or negative sentiment are considered polar; all others are considered non-polar. Sentiment analysis of tweets can potentially benefit different parties, such as consumers and marketing researchers, by providing opinions on different products and services. We present methods for normalizing the text of noisy tweets and for classifying them with respect to polarity. We experiment with a mixture model approach for generating sentiment words, which are later used as indicator features in the classification model. On a manually annotated gold-standard set of tweets, the new approach obtains F-scores that are a relative 10% better than those of a classification baseline that uses raw word n-gram features.
Keywords :
behavioural sciences computing; feature extraction; learning (artificial intelligence); pattern classification; probability; social networking (online); text analysis; F-scores; machine learning; mixture model approach; negative sentiment; noisy tweets; nonpolar tweet; polar tweet; probabilistic model based sentiment analysis; raw word n-gram features; sentiment classification; sentimental words; text normalization; twitter messages; Sentiment analysis; Twitter; feature extraction; micro-blogs; probabilistic graphical models;
Conference_Title :
Spoken Language Technology Workshop (SLT), 2010 IEEE
Conference_Location :
Berkeley, CA
Print_ISBN :
978-1-4244-7904-7
Electronic_ISBN :
978-1-4244-7902-3
DOI :
10.1109/SLT.2010.5700826