DocumentCode :
2706947
Title :
Constructing Chinese domain lexicon with improved entropy formula for sentiment analysis
Author :
Zhang, Jing ; Peng, Qinke
Author_Institution :
Sch. of Electron. & Inf. Eng., Xi'an Jiaotong Univ., Xi'an, China
fYear :
2012
fDate :
6-8 June 2012
Firstpage :
850
Lastpage :
855
Abstract :
Sentiment analysis can promptly reveal public attitudes and mental states, providing a basis for decision-making. Most previous work either uses a sentiment lexicon to supply the sentiment orientation of words for text classification or relies on labeled corpora to train a sentiment classifier. However, sentiment is expressed differently in different domains, and cyber words have proliferated with the extensive use of the Internet. Traditional methods cannot adapt to them well. In this paper, we introduce a method to calculate sentiment scores of words and phrases in different domains for sentiment analysis. This method not only remedies the inability of existing lexicons to adapt to different domains and cyber words, but also avoids the high cost of manual annotation in corpus-based approaches. Meanwhile, to avoid the high dimensionality and sparseness of the "bag of words" method, we calculate a weighted sum of the sentiment scores in a review for classification. Experimental results show that our method is effective for sentiment classification on movie reviews and stock reviews.
Keywords :
Internet; natural language processing; pattern classification; text analysis; Chinese domain lexicon; cyber words; decision-making; improved entropy formula; labeled corpora; sentiment analysis; sentiment classifier; sentiment orientation; text classification; Accuracy; Educational institutions; Indexes; Motion pictures; Semantics; Testing; domain lexicon; sentiment score
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information and Automation (ICIA), 2012 International Conference on
Conference_Location :
Shenyang
Print_ISBN :
978-1-4673-2238-6
Electronic_ISBN :
978-1-4673-2236-2
Type :
conf
DOI :
10.1109/ICInfA.2012.6246900
Filename :
6246900