• DocumentCode
    3452282
  • Title

    A non-parametric LDA-based induction method for sentiment analysis

  • Author

    Shams, Mohammadreza ; Shakery, Azadeh ; Faili, Heshaam

  • Author_Institution
    Sch. of Electr. & Comput. Eng., Univ. of Tehran, Tehran, Iran
  • fYear
    2012
  • fDate
    2-3 May 2012
  • Firstpage
    216
  • Lastpage
    221
  • Abstract
    Sentiment analysis is the process of analyzing the characteristics of opinions, feelings, and emotions expressed in textual data. This paper presents a novel approach for generating a lexical resource named PersianClues, used for sentiment analysis in the Persian language. Moreover, a novel unsupervised LDA-based sentiment analysis method called LDASA is proposed. To generate PersianClues, in the first phase an automatic translation approach is used to translate the existing English clues to Persian. Next, an iterative refinement approach is used to correct the erroneous clues resulting from the previous step. Then, topic-based polar sets are derived from these clues, and finally each document is assigned to its polarity using a classification algorithm. To evaluate this method, three resources about hotels, cell phones, and digital cameras were manually gathered from e-shopping websites, and the results of sentiment analysis on these resources are compared with a baseline named SVM-Unigrams. The experimental results demonstrate an average improvement of 9% in polarity classification accuracy over the baseline system.
  • Keywords
    iterative methods; language translation; natural language processing; pattern classification; support vector machines; text analysis; English clues; LDA-based sentiment analysis method; LDASA; Persian language; PersianClues; SVM-Unigrams; automatic translation approach; cell phones; classification algorithm; digital cameras; e-shopping Websites; hotels; iterative refinement approach; lexical resource; nonparametric LDA-based induction method; polarity classification accuracy; textual data; topic-based polar sets; Accuracy; Algorithm design and analysis; Cellular phones; Classification algorithms; Educational institutions; Mathematical model; Support vector machines; LDA-based sentiment analysis; Latent Dirichlet allocation; opinion mining; polarity classification; sentiment analysis;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Artificial Intelligence and Signal Processing (AISP), 2012 16th CSI International Symposium on
  • Conference_Location
    Shiraz, Fars
  • Print_ISBN
    978-1-4673-1478-7
  • Type

    conf

  • DOI
    10.1109/AISP.2012.6313747
  • Filename
    6313747