• DocumentCode
    2626267
  • Title
    An Unsupervised Data-Driven Cross-Lingual Method for Building High Precision Sentiment Lexicons
  • Author
    Sangiorgi, Pierluca ; Augello, Agnese ; Pilato, Giovanni
  • Author_Institution
    ICAR (Ist. di Calcolo e Reti ad Alte Prestazioni), Palermo, Italy
  • fYear
    2013
  • fDate
    16-18 Sept. 2013
  • Firstpage
    184
  • Lastpage
    190
  • Abstract
    In this paper we present a completely unsupervised approach for creating a sentiment lexicon. The approach is realized as a pipeline implementing an unsupervised system that covers several aspects: the automatic extraction of user reviews, the pre-processing of text, the use of a scoring measure that combines entropy, term frequency, and inverse document frequency, and finally a cross-lingual intersection. We have validated the approach through the analysis of app reviews present in the Google Play market. The results show the effectiveness of the approach, given by satisfactory values of precision for the obtained lexicon.
  • Keywords
    computational linguistics; entropy; information retrieval; text analysis; unsupervised learning; Google Play market; cross lingual intersection; high precision sentiment lexicons; inverse document frequency; scoring measure; term frequency; text preprocessing; unsupervised data-driven cross-lingual method; unsupervised system; user reviews automatic extraction; Buildings; Dictionaries; Entropy; Frequency measurement; Google; Pipelines; Pragmatics; Machine Learning; Sentiment Analysis; Sentiment Lexicon
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Semantic Computing (ICSC), 2013 IEEE Seventh International Conference on
  • Conference_Location
    Irvine, CA
  • Type
    conf
  • DOI
    10.1109/ICSC.2013.40
  • Filename
    6693515