DocumentCode
2626267
Title
An Unsupervised Data-Driven Cross-Lingual Method for Building High Precision Sentiment Lexicons
Author
Sangiorgi, Pierluca ; Augello, Agnese ; Pilato, Giovanni
Author_Institution
ICAR (Ist. di Calcolo e Reti ad Alte Prestazioni), Palermo, Italy
fYear
2013
fDate
16-18 Sept. 2013
Firstpage
184
Lastpage
190
Abstract
In this paper we present a completely unsupervised approach for creating a sentiment lexicon. The approach is realized as a pipeline implementing an unsupervised system that covers several aspects: the automatic extraction of user reviews, the pre-processing of text, the use of a scoring measure that combines entropy, term frequency, and inverse document frequency, and finally a cross-lingual intersection. We have validated the approach through the analysis of reviews present in the Google Play market. The results show the effectiveness of the approach, given by satisfactory values of precision for the obtained lexicon.
Keywords
computational linguistics; entropy; information retrieval; text analysis; unsupervised learning; Google Play market; cross lingual intersection; entropy; high precision sentiment lexicons; inverse document frequency; scoring measure; term frequency; text preprocessing; unsupervised data-driven cross-lingual method; unsupervised system; user reviews automatic extraction; Buildings; Dictionaries; Entropy; Frequency measurement; Google; Pipelines; Pragmatics; Machine Learning; Sentiment Analysis; Sentiment Lexicon;
fLanguage
English
Publisher
IEEE
Conference_Titel
Semantic Computing (ICSC), 2013 IEEE Seventh International Conference on
Conference_Location
Irvine, CA
Type
conf
DOI
10.1109/ICSC.2013.40
Filename
6693515