DocumentCode :
3289316
Title :
A hybrid BSO-Chi2-SVM approach to Arabic text categorization
Author :
Belkebir, Riadh ; Guessoum, Abderrezak
Author_Institution :
Comput. Sci. Dept., USTHB, El-Alia Bab-Ezzouar, Algeria
fYear :
2013
fDate :
27-30 May 2013
Firstpage :
1
Lastpage :
7
Abstract :
Automatic categorization of documents has become an important task, especially with the rapid growth in the number of documents available online. It consists of assigning a category to a text based on the information it contains, thereby automating the association of a document with a category. Automatic categorization can help solve several problems, such as identifying the language of a document, filtering and detecting spam (junk mail), and routing and forwarding emails to their recipients. In this paper, we present the results of Arabic text categorization based on three different approaches: artificial neural networks, support vector machines (SVMs), and a hybrid BSO-Chi2-SVM approach. We explain the approach and present the results of the implementation and evaluation using two types of representations: root-based stemming and light stemming. In each case, the evaluation was carried out on the Open Source Arabic Corpora (OSAC) using several performance measures.
Keywords :
neural nets; support vector machines; text analysis; Arabic text categorization; OSAC; artificial neural networks; automatic document categorization; hybrid BSO-Chi2-SVM approach; light stemming; open source Arabic corpora; root-based stemming; spam detection; spam filtering; Accuracy; Particle swarm optimization; Text categorization; Training
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Systems and Applications (AICCSA), 2013 ACS International Conference on
Conference_Location :
Ifrane
ISSN :
2161-5322
Type :
conf
DOI :
10.1109/AICCSA.2013.6616437
Filename :
6616437