  • DocumentCode
    2711049
  • Title
    Document-Word Co-regularization for Semi-supervised Sentiment Analysis
  • Author
    Sindhwani, Vikas; Melville, Prem
  • Author_Institution
    IBM T. J. Watson Res. Center, Yorktown Heights, NY
  • fYear
    2008
  • fDate
    15-19 Dec. 2008
  • Firstpage
    1025
  • Lastpage
    1030
  • Abstract
    The goal of sentiment prediction is to automatically identify whether a given piece of text expresses a positive or negative opinion towards a topic of interest. One can pose sentiment prediction as a standard text categorization problem, but gathering labeled data turns out to be a bottleneck. Fortunately, background knowledge is often available in the form of prior information about the sentiment polarity of words in a lexicon. Moreover, in many applications abundant unlabeled data is also available. In this paper, we propose a novel semi-supervised sentiment prediction algorithm that utilizes lexical prior knowledge in conjunction with unlabeled examples. Our method performs joint sentiment analysis of documents and words using a bipartite graph representation of the data. We present an empirical study on a diverse collection of sentiment prediction problems, which confirms that our semi-supervised lexical models significantly outperform purely supervised and competing semi-supervised techniques.
  • Keywords
    graph theory; least squares approximations; text analysis; word processing; bipartite graph representation; document joint sentiment analysis; document-word co-regularization; prediction algorithm; semi-supervised sentiment prediction algorithm; standard regularized least squares; text categorization problem; Blogs; Data mining; Discussion forums; Frequency; Machine learning; Motion pictures; Prediction algorithms; Text analysis; Text categorization; Vectors; Graph Transduction; Linear models; Semi-supervised Learning; Sentiment Analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Data Mining, 2008. ICDM '08. Eighth IEEE International Conference on
  • Conference_Location
    Pisa
  • ISSN
    1550-4786
  • Print_ISBN
    978-0-7695-3502-9
  • Type
    conf
  • DOI
    10.1109/ICDM.2008.113
  • Filename
    4781219