• DocumentCode
    629554
  • Title
    Wikipedia based semantic smoothing for Twitter sentiment classification
  • Author
    Torunoglu, Dilara ; Telseren, Gurkan ; Sagturk, Ozgun ; Ganiz, Murat Can
  • Author_Institution
    Comput. Eng. Dept., Dogus Univ., Istanbul, Turkey
  • fYear
    2013
  • fDate
    19-21 June 2013
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Sentiment classification is an important and popular application area of text classification in which texts are labeled as positive or negative. Naïve Bayes (NB) is one of the most widely used algorithms in this area. Although NB has the advantages of lower complexity and a simpler training procedure, it suffers from sparsity. Smoothing can address this problem, and Laplace smoothing is the most common choice; in this paper, however, we propose a Wikipedia-based semantic smoothing approach. We extend the semantic smoothing approach by using, as topic signatures, the Wikipedia article titles that occur in the training documents together with the categories and redirects of those articles. Results of extensive experiments show that our approach improves the performance of NB and can even exceed the accuracy of SVM on the Twitter Sentiment 140 dataset.
  • Keywords
    Bayes methods; Web sites; learning (artificial intelligence); pattern classification; smoothing methods; social networking (online); support vector machines; text analysis; Laplace smoothing; NB performance; Naïve Bayes; SVM accuracy; Twitter sentiment 140 dataset; Twitter sentiment classification; text classification; topic signatures; training documents; training procedure; wikipedia article titles; wikipedia based semantic smoothing; Electronic publishing; Encyclopedias; Internet; Niobium; Semantics; Smoothing methods; semantic smoothing; text classification; twitter corpus; wiki concept; wikipedia
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Innovations in Intelligent Systems and Applications (INISTA), 2013 IEEE International Symposium on
  • Conference_Location
    Albena
  • Print_ISBN
    978-1-4799-0659-8
  • Type
    conf
  • DOI
    10.1109/INISTA.2013.6577649
  • Filename
    6577649
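
The abstract notes that Naïve Bayes suffers from sparsity and that Laplace smoothing is the usual remedy. The sketch below illustrates only that baseline on a toy corpus: a word never seen with a class gets probability zero without smoothing and a small non-zero probability with it. It is not the paper's Wikipedia-based semantic smoothing, whose details are not given in this record. The function names (`train_nb`, `word_prob`, `classify`), the toy tweets, and the choice `alpha=1.0` are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_nb(docs):
    """Count documents per class and word occurrences per class.

    docs: list of (tokens, label) pairs (hypothetical toy format).
    """
    class_docs = Counter()
    word_counts = {}
    vocab = set()
    for tokens, label in docs:
        class_docs[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return {"class_docs": class_docs, "word_counts": word_counts, "vocab": vocab}


def word_prob(model, word, label, alpha=1.0):
    """Estimate P(word | label); alpha=0 is the unsmoothed estimate,
    alpha=1 is Laplace (add-one) smoothing."""
    counts = model["word_counts"][label]
    total = sum(counts.values())
    return (counts[word] + alpha) / (total + alpha * len(model["vocab"]))


def classify(model, tokens, alpha=1.0):
    """Return the label with the highest log-posterior (alpha must be > 0).

    Tokens outside the training vocabulary are skipped.
    """
    n_docs = sum(model["class_docs"].values())
    best_label, best_score = None, -math.inf
    for label, n in model["class_docs"].items():
        score = math.log(n / n_docs)
        for w in tokens:
            if w in model["vocab"]:
                score += math.log(word_prob(model, w, label, alpha))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for this example.
train = [
    (["love", "this", "phone"], "positive"),
    (["great", "day"], "positive"),
    (["hate", "this", "traffic"], "negative"),
    (["awful", "day"], "negative"),
]
model = train_nb(train)

unsmoothed = word_prob(model, "love", "negative", alpha=0.0)  # sparsity: zero
smoothed = word_prob(model, "love", "negative", alpha=1.0)    # (0 + 1) / (5 + 8)
predicted = classify(model, ["love", "this", "day"])
```

Semantic smoothing, as described in the abstract, would replace the uniform `alpha` term with probability mass derived from topic signatures (here, Wikipedia article titles, categories, and redirects); how that mass is computed is left open in this sketch.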