DocumentCode :
3745838
Title :
Topic Identification of Noisy Arabic Texts Using Graph Approaches
Author :
Kheireddine Abainia;Siham Ouamour;Halim Sayoud
Author_Institution :
USTHB Univ., Algiers, Algeria
fYear :
2015
Firstpage :
254
Lastpage :
258
Abstract :
This paper addresses automatic topic identification of noisy Arabic texts. Several existing works in this field rely on statistical and machine learning approaches for different text categories. Unfortunately, most of the proposed methods are effective mainly on clean, long texts. In this work, we use an in-house dataset of noisy Arabic texts collected from several Arabic discussion forums and covering 6 topics. We propose applying a graph approach called LIGA, which was first introduced for language identification, to the topic identification task. Moreover, we propose two extensions intended to enhance the performance of LIGA. Experiments conducted on the Arabic dataset show promising performance, reaching an accuracy of about 98%.
Keywords :
"Training","Noise measurement","Text categorization","Text mining","Mathematical model","Discussion forums","Ontologies"
Publisher :
IEEE
Conference_Title :
Database and Expert Systems Applications (DEXA), 2015 26th International Workshop on
ISSN :
1529-4188
Print_ISBN :
978-1-4673-7581-8
Electronic_ISBN :
2378-3915
Type :
conf
DOI :
10.1109/DEXA.2015.63
Filename :
7406302