DocumentCode
188134
Title
Cross-Lingual Short-Text Document Classification for Facebook Comments
Author
Faqeeh, Mosab ; Abdulla, Nawaf ; Al-Ayyoub, Mahmoud ; Jararweh, Yaser ; Quwaider, Muhannad
Author_Institution
Jordan Univ. of Sci. & Technol., Irbid, Jordan
fYear
2014
fDate
27-29 Aug. 2014
Firstpage
573
Lastpage
578
Abstract
Document Classification (DC) is one of the fundamental problems in text mining. Plenty of works exist on DC with interesting approaches and excellent results; however, most of them focus on long-text documents written in a single language, with English being the most studied language. This work is concerned with the natural step beyond such works, which is cross-lingual DC for short-text documents. Specifically, we consider two languages, Arabic and English, and compare the performance of some of the most popular document classifiers on two datasets of short Facebook comments. Apart from limited attempts, the addressed problem has not been studied well enough. The results are encouraging and new insights are obtained.
Keywords
pattern classification; social networking (online); text analysis; DC; English; Facebook comments; cross-lingual short-text document classification; long-text documents; text mining; Accuracy; Facebook; Niobium; Sentiment analysis; Support vector machines; Text categorization; cross-lingual text analysis; decision tree; document classification; k-nearest neighbor; naive Bayes; social network comments; support vector machine;
fLanguage
English
Publisher
ieee
Conference_Titel
Future Internet of Things and Cloud (FiCloud), 2014 International Conference on
Conference_Location
Barcelona
Type
conf
DOI
10.1109/FiCloud.2014.99
Filename
6984255