• DocumentCode
    2950625
  • Title
    Arabic sentiment analysis: Lexicon-based and corpus-based
  • Author
    Abdulla, Nawaf A. ; Ahmed, Nizar A. ; Shehab, Mohammad A. ; Al-Ayyoub, Mahmoud
  • Author_Institution
    Comput. Sci. Dept., Jordan Univ. of Sci. & Technol., Irbid, Jordan
  • fYear
    2013
  • fDate
    3-5 Dec. 2013
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    The emergence of Web 2.0 technology has generated a massive amount of raw data by enabling Internet users to post their opinions, reviews, and comments on the web. Processing this raw data to extract useful information can be a very challenging task. An example of important information that can be automatically extracted from users' posts and comments is their opinions on different issues, events, services, products, etc. This problem of Sentiment Analysis (SA) has been well studied for the English language, and two main approaches have been devised: corpus-based and lexicon-based. This paper addresses both approaches to SA for the Arabic language. Since there is a limited number of publicly available Arabic datasets and Arabic lexicons for SA, this paper starts by building a manually annotated dataset and then takes the reader through the detailed steps of building the lexicon. Experiments are conducted throughout the different stages of this process to observe the improvements gained in the accuracy of the system and to compare them to the corpus-based approach.
  • Keywords
    Internet; data mining; human factors; natural languages; social networking (online); text analysis; Arabic language; Arabic lexicons; Arabic sentiment analysis; English language; Internet users; Web 2.0 technology; corpus-based approach; corpus-based sentiment analysis; information extraction; lexicon-based sentiment analysis; publicly available Arabic dataset; user comments; user posts; Accuracy; Buildings; Data mining; Dictionaries; Internet; Niobium; Support vector machines; Arabic language; Corpus-based; Lexicon-based; Opinion mining; Sentiment analysis;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Applied Electrical Engineering and Computing Technologies (AEECT), 2013 IEEE Jordan Conference on
  • Conference_Location
    Amman
  • Print_ISBN
    978-1-4799-2305-2
  • Type
    conf
  • DOI
    10.1109/AEECT.2013.6716448
  • Filename
    6716448