• DocumentCode
    188134
  • Title
    Cross-Lingual Short-Text Document Classification for Facebook Comments
  • Author
    Faqeeh, Mosab ; Abdulla, Nawaf ; Al-Ayyoub, Mahmoud ; Jararweh, Yaser ; Quwaider, Muhannad
  • Author_Institution
    Jordan Univ. of Sci. & Technol., Irbid, Jordan
  • fYear
    2014
  • fDate
    27-29 Aug. 2014
  • Firstpage
    573
  • Lastpage
    578
  • Abstract
    Document Classification (DC) is one of the fundamental problems in text mining. Plenty of works exist on DC with interesting approaches and excellent results; however, most of them focus on long-text documents written in a single language, with English being the most studied language. This work is concerned with the natural step beyond such works, which is cross-lingual DC for short-text documents. Specifically, we consider two languages, Arabic and English, and compare the performance of some of the most popular document classifiers on two datasets of short Facebook comments. Apart from limited attempts, the addressed problem has not been studied well enough. The results are encouraging and new insights are obtained.
  • Keywords
    pattern classification; social networking (online); text analysis; DC; English; Facebook comments; cross-lingual short-text document classification; long-text documents; text mining; Accuracy; Facebook; Niobium; Sentiment analysis; Support vector machines; Text categorization; cross-lingual text analysis; decision tree; document classification; k-nearest neighbor; naive Bayes; social network comments; support vector machine
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Future Internet of Things and Cloud (FiCloud), 2014 International Conference on
  • Conference_Location
    Barcelona
  • Type
    conf
  • DOI
    10.1109/FiCloud.2014.99
  • Filename
    6984255