Title of article :
Using N-Grams for Arabic Text Searching
Author/Authors :
Suleiman H. Mustafa and Qasem A. Al-Radaideh
Issue Information :
Monthly issue with serial number, 2004
Pages :
6
From page :
1002
To page :
1007
Abstract :
N-grams have been widely investigated for a number of text processing and retrieval applications. This article examines the performance of the digram and trigram term conflation techniques in the context of Arabic free-text retrieval. It reports the results of applying the N-gram approach to a corpus of thousands of distinct textual words drawn from a number of sources representing various disciplines. The results indicate that the digram method offers better performance than the trigram method with respect to conflation precision and conflation recall ratios. In either case, the N-gram approach does not appear to provide an efficient conflation approach, owing to the peculiarities imposed by the Arabic infix structure, which reduces the rate of correct N-gram matching.
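The effect of infixes on N-gram matching can be illustrated with a minimal sketch. The abstract does not state which similarity measure the article uses; the Dice coefficient over distinct character N-grams below is an assumption chosen because it is a common choice for N-gram conflation, and the function names `ngrams` and `dice` are hypothetical. The example pair is illustrative only and is not taken from the article's corpus.

```python
def ngrams(word: str, n: int) -> set:
    """Return the set of distinct contiguous character n-grams in word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def dice(a: str, b: str, n: int) -> float:
    """Dice coefficient over the distinct n-grams of two words.

    Assumption: this measure stands in for whatever similarity
    function the article actually uses.
    """
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a or not grams_b:
        # A word shorter than n has no n-grams, so nothing can match.
        return 0.0
    shared = len(grams_a & grams_b)
    return 2 * shared / (len(grams_a) + len(grams_b))


# Illustrative pair: "كتاب" (book) and its broken plural "كتب" (books).
# The infixed letter in the singular splits the shared root letters.
singular, plural = "كتاب", "كتب"

digram_score = dice(singular, plural, 2)   # one shared digram out of 3 + 2
trigram_score = dice(singular, plural, 3)  # no shared trigram
```

In this pair a single infixed letter leaves only one digram in common and no trigram at all, which is consistent with the abstract's observation that the infix structure lowers the rate of correct N-gram matches and that trigrams fare worse than digrams.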
Journal title :
Journal of the American Society for Information Science and Technology
Serial Year :
2004
Record number :
843844