DocumentCode :
515425
Title :
Naive Bayes Classifier based Arabic document categorization
Author :
Noaman, Hatem M. ; Elmougy, Samir ; Ghoneim, Ahmed ; Hamza, Taher
Author_Institution :
Fac. of Comput. & Inf. Sci., Mansoura Univ., Mansoura, Egypt
fYear :
2010
fDate :
28-30 March 2010
Firstpage :
1
Lastpage :
5
Abstract :
Text categorization aims to assign an electronic document to one or more categories based on its contents. With the rapid growth in the number of online Arabic documents, information libraries, and Arabic document corpora, automatic Arabic document classification has become an important task. This paper proposes applying a rooting algorithm together with a Naïve Bayes classifier to the problem of Arabic document categorization and reports the algorithm's performance in terms of error rate, accuracy, and micro-average recall. Our experimental study shows that using the rooting algorithm with the Naïve Bayes (NB) classifier gives ~62.23% average accuracy and decreases the dimensionality of the training documents.
Keywords :
belief networks; natural language processing; text analysis; Arabic document categorization; electronic document; naive Bayes classifier; rooting algorithm; text categorization; Classification tree analysis; Inference algorithms; Information filtering; Information filters; Information retrieval; Machine learning; Machine learning algorithms; Niobium; Support vector machines; Text categorization; Naïve Bayes classifier; document categorization; machine learning; natural language processing for Arabic language;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Informatics and Systems (INFOS), 2010 The 7th International Conference on
Conference_Location :
Cairo
Print_ISBN :
978-1-4244-5828-8
Type :
conf
Filename :
5461819