DocumentCode :
2711049
Title :
Document-Word Co-regularization for Semi-supervised Sentiment Analysis
Author :
Sindhwani, Vikas ; Melville, Prem
Author_Institution :
IBM T. J. Watson Res. Center, Yorktown Heights, NY
fYear :
2008
fDate :
15-19 Dec. 2008
Firstpage :
1025
Lastpage :
1030
Abstract :
The goal of sentiment prediction is to automatically identify whether a given piece of text expresses a positive or negative opinion towards a topic of interest. One can pose sentiment prediction as a standard text categorization problem, but gathering labeled data turns out to be a bottleneck. Fortunately, background knowledge is often available in the form of prior information about the sentiment polarity of words in a lexicon. Moreover, in many applications abundant unlabeled data is also available. In this paper, we propose a novel semi-supervised sentiment prediction algorithm that utilizes lexical prior knowledge in conjunction with unlabeled examples. Our method performs joint sentiment analysis of documents and words, using a bipartite graph representation of the data. We present an empirical study on a diverse collection of sentiment prediction problems, which confirms that our semi-supervised lexical models significantly outperform purely supervised and competing semi-supervised techniques.
Keywords :
graph theory; least squares approximations; text analysis; word processing; bipartite graph representation; document joint sentiment analysis; document-word co-regularization; prediction algorithm; semi-supervised sentiment prediction algorithm; standard regularized least squares; text categorization problem; Blogs; Data mining; Discussion forums; Frequency; Machine learning; Motion pictures; Prediction algorithms; Text analysis; Text categorization; Vectors; Graph Transduction; Linear models; Semi-supervised Learning; Sentiment Analysis
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining, 2008. ICDM '08. Eighth IEEE International Conference on
Conference_Location :
Pisa
ISSN :
1550-4786
Print_ISBN :
978-0-7695-3502-9
Type :
conf
DOI :
10.1109/ICDM.2008.113
Filename :
4781219