Title :
Constructing Chinese domain lexicon with improved entropy formula for sentiment analysis
Author :
Zhang, Jing ; Peng, Qinke
Author_Institution :
Sch. of Electron. & Inf. Eng., Xi'an Jiaotong Univ., Xi'an, China
Abstract :
Sentiment analysis can promptly reveal public attitudes and mental states, providing a basis for decision-making. Most previous work either uses a sentiment lexicon to obtain the sentiment orientation of words for text classification or relies on labeled corpora to train a sentiment classifier. However, sentiment is expressed differently in different domains, and cyber words have proliferated with the widespread use of the Internet, so traditional methods do not adapt well to either. In this paper, we introduce a method to calculate sentiment scores of words and phrases in different domains for sentiment analysis. This method not only remedies the inability of existing lexicons to adapt to different domains and cyber words, but also avoids the high cost of manual annotation in corpus-based approaches. In addition, to avoid the high dimensionality and sparseness of the "bag of words" method, we classify a review by calculating the weighted sum of the sentiment scores in it. Experimental results show that our method is effective for sentiment classification of movie reviews and stock reviews.
Keywords :
Internet; natural language processing; pattern classification; text analysis; Chinese domain lexicon; cyber words; decision-making; improved entropy formula; labeled corpora; sentiment analysis; sentiment classifier; sentiment orientation; text classification; Accuracy; Educational institutions; Indexes; Motion pictures; Semantics; Testing; domain lexicon; sentiment score
Conference_Title :
Information and Automation (ICIA), 2012 International Conference on
Conference_Location :
Shenyang
Print_ISBN :
978-1-4673-2238-6
Electronic_ISBN :
978-1-4673-2236-2
DOI :
10.1109/ICInfA.2012.6246900