• DocumentCode
    708161
  • Title

    Multi-way sentiment classification of Arabic reviews

  • Author

    Al Shboul, Bashar ; Al-Ayyoub, Mahmoud ; Jararweh, Yaser

  • Author_Institution
    Carleton Univ., Ottawa, ON, Canada
  • fYear
    2015
  • fDate
    7-9 April 2015
  • Firstpage
    206
  • Lastpage
    211
  • Abstract
    The evolution of the Web and the appearance of new technologies led to the rise of new ways for Internet users to express their opinions and feelings regarding different aspects of life. Such expressions are written in an unstructured way using natural languages. They hold a great deal of knowledge about users' opinions and reactions on various subjects. As a result, a new field called Sentiment Analysis (SA) has come into existence to address the complicated task of extracting such opinions or sentiments from the massive pool of unstructured text available online. Traditional works on SA consider only two sentiments: positive and negative. Multi-way SA considers sentiments expressed using a star or ranking system. For example, in a 5-star ranking system, the user's opinion ranges from very negative (1 star) to very positive (5 stars). This version of SA is much harder to handle, which partly explains the limited number of works on it. Moreover, we focus in this work on the Arabic language, which is largely understudied compared to the English language. In this work, a new and relatively large Arabic dataset is used. The dataset, called the Large Arabic Book Reviews (LABR) dataset, is gathered from an online book reviews website. The objective of this work is to perform baseline experiments on this dataset by applying the Bag-Of-Words model coupled with the most popular classifiers. We also investigate the effect of stemming and balancing the dataset. The obtained accuracies are low, confirming the intuition that the multi-way SA problem is very difficult and needs further attention.
  • Keywords
    Web sites; classification; data mining; natural language processing; social sciences computing; text analysis; Arabic language; Internet users; LABR dataset; Web evolution; bag-of-words; dataset balancing; dataset stemming; large Arabic book reviews dataset; multiway SA problem; multiway sentiment classification; natural languages; negative sentiments; online book reviews Website; opinion mining; positive sentiments; sentiment analysis; star ranking system; unstructured text; user opinions; user reactions; Accuracy; Internet; Market research; Niobium; Sentiment analysis; Support vector machines; Arabic Text; Bag-Of-Words; Decision Tree; K-Nearest Neighbor; Multi-Way Sentiment Analysis; Naive Bayes; Voting;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information and Communication Systems (ICICS), 2015 6th International Conference on
  • Conference_Location
    Amman
  • Type

    conf

  • DOI
    10.1109/IACS.2015.7103228
  • Filename
    7103228