DocumentCode
2798071
Title
A Novel Algorithm to Extract Tri-Literal Arabic Roots
Author
Momani, Mohanned ; Faraj, Jamil
Author_Institution
AABFS, Amman
fYear
2007
fDate
13-16 May 2007
Firstpage
309
Lastpage
315
Abstract
Stemming and root extraction play a significant role in information retrieval systems, particularly for the Arabic language. In this article, we propose and implement a novel algorithm to extract tri-literal Arabic roots. Rootless words are filtered out, then prefixes and suffixes are removed. Double letters that belong to the Arabic word are removed after sorting the term's letters. Letter removal continues until three letters remain. Finally, the remaining letters are arranged according to their order in the original word. The implementation of the algorithm was tested on two types of Arabic text documents. The results of both runs were promising, showing an accuracy of over 73%.
Keywords
feature extraction; information retrieval; natural language processing; query languages; Arabic language; Arabic text documents; information retrieval systems; letter removal; prefixes-suffixes removal; stemming role; triliteral Arabic root extraction; Algorithm design and analysis; Data mining; Information retrieval; Pattern matching; Shape; Sorting; Surface morphology; Testing; Visual BASIC
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Systems and Applications, 2007. AICCSA '07. IEEE/ACS International Conference on
Conference_Location
Amman
Print_ISBN
1-4244-1030-4
Electronic_ISBN
1-4244-1031-2
Type
conf
DOI
10.1109/AICCSA.2007.370899
Filename
4230974