• DocumentCode
    2798071
  • Title
    A Novel Algorithm to Extract Tri-Literal Arabic Roots
  • Author
    Momani, Mohanned ; Faraj, Jamil
  • Author_Institution
    AABFS, Amman
  • fYear
    2007
  • fDate
    13-16 May 2007
  • Firstpage
    309
  • Lastpage
    315
  • Abstract
    Stemming and root extraction play a significant role in information retrieval systems, particularly for the Arabic language. In this article, we propose and implement a novel algorithm to extract tri-literal Arabic roots. Rootless words are filtered out, then prefixes and suffixes are removed. Double letters that belong to the Arabic word are removed after sorting the term's letters. Letter removal continues until three letters remain. Finally, the remaining letters are arranged according to their order in the original word. The implementation of the algorithm was tested on two types of Arabic text documents. The results of both runs were promising, showing an accuracy of over 73%.
  • Keywords
    feature extraction; information retrieval; natural language processing; query languages; Arabic language; Arabic text documents; information retrieval systems; letter removal; prefixes-suffixes removal; stemming role; triliteral Arabic root extraction; Algorithm design and analysis; Data mining; Information retrieval; Pattern matching; Shape; Sorting; Surface morphology; Testing; Visual BASIC
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Systems and Applications, 2007. AICCSA '07. IEEE/ACS International Conference on
  • Conference_Location
    Amman
  • Print_ISBN
    1-4244-1030-4
  • Electronic_ISBN
    1-4244-1031-2
  • Type
    conf
  • DOI
    10.1109/AICCSA.2007.370899
  • Filename
    4230974
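
The steps summarized in the abstract (filter rootless words, strip prefixes and suffixes, sort the letters to drop repeats, remove letters until three remain, restore the original order) can be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the word lists or the rule for choosing which letter to remove, so `ROOTLESS`, `PREFIXES`, `SUFFIXES`, and `REMOVAL_PRIORITY` are hypothetical placeholders, and keeping the first occurrence of a repeated letter is likewise an assumption.

```python
# Hypothetical placeholder data; the paper's actual lists are not given here.
ROOTLESS = {"في", "من", "على", "إلى"}          # words treated as having no root
PREFIXES = ("وال", "بال", "ال")                # checked longest first
SUFFIXES = ("ات", "ون", "ين", "ها", "ة")
# Assumed removal order: weak letters first, then other common affix letters.
REMOVAL_PRIORITY = "اويأمنتسهل"


def strip_affixes(word):
    """Remove at most one prefix and one suffix, keeping at least 3 letters."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= 3:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[:-len(suffix)]
            break
    return word


def extract_root(word):
    """Return a candidate tri-literal root, or None for a rootless word."""
    if word in ROOTLESS:
        return None
    stem = strip_affixes(word)

    # Sort (letter, position) pairs so repeated letters become adjacent,
    # then keep only the first occurrence of each letter (an assumption).
    kept = []
    for letter, position in sorted((ch, i) for i, ch in enumerate(stem)):
        if not kept or kept[-1][0] != letter:
            kept.append((letter, position))

    # Remove letters in the assumed priority order until three remain.
    # If no removable letter is left, more than three letters may survive.
    for extra in REMOVAL_PRIORITY:
        if len(kept) <= 3:
            break
        kept = [(ch, i) for ch, i in kept if ch != extra]

    # Arrange the survivors by their position in the stripped word.
    kept.sort(key=lambda pair: pair[1])
    return "".join(ch for ch, _ in kept)


if __name__ == "__main__":
    for sample in ("المكتبة", "كاتب", "مكتوب", "في"):
        print(sample, extract_root(sample))
```

With these placeholder lists, the three sample words derived from the root كتب all reduce to that root, while في is filtered out as rootless. Words whose root itself contains a repeated or weak letter would be handled incorrectly by this sketch, which is in line with the abstract reporting accuracy well below 100%.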