DocumentCode :
2527692
Title :
A System for Evaluation of Arabic Root Extraction Methods
Author :
El Salam Al Hajjar, Abd ; Hajjar, Mohammad ; Zreik, Khaldoun
Author_Institution :
Paragraph Lab., Univ. of Paris 8, Vincennes-Saint-Denis, France
fYear :
2010
fDate :
9-15 May 2010
Firstpage :
506
Lastpage :
512
Abstract :
In this article, we present a new application that evaluates the performance of several Arabic root extraction methods. The methods implemented in this system were selected according to a previous classification that groups such methods into five categories; we selected one method from each category. These methods are: Light Stemmer, Arabic Stemming without a root dictionary, MT-based Arabic Stemmer, N-gram based on similarity coefficient, and N-gram based on dissimilarity coefficient. All methods were evaluated on the same terms, drawn from a corpus of two thousand words and their roots taken from the Arabic dictionary "Lesan Al-Arab". This application has given us a first original comparison of the evaluated methods. The system works in two modes: normal and automatic.
Keywords :
information retrieval; languages; Arabic root extraction methods; Arabic stemming; N-gram based on dissimilarity coefficient; N-gram based on similarity coefficient; light stemmer; root dictionary; Data mining; Dictionaries; Information retrieval; Laboratories; Tagging; Testing; Vocabulary; Web and internet services; Arabic language; Dictionary; Evaluation; Information extraction; N-gram; Stemmer;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Internet and Web Applications and Services (ICIW), 2010 Fifth International Conference on
Conference_Location :
Barcelona
Print_ISBN :
978-1-4244-6728-0
Type :
conf
DOI :
10.1109/ICIW.2010.98
Filename :
5476492