Title of article :
Using N-Grams for Arabic Text Searching
Author/Authors :
Suleiman H. Mustafa and Qasem A. Al-Radaideh
Issue Information :
Monthly, serial issue number, year 2004
Abstract :
N-grams have been widely investigated for a number of
text processing and retrieval applications. This article
examines the performance of the digram and trigram
term conflation techniques in the context of Arabic free
text retrieval. It reports the results of using the N-gram
approach for a corpus of thousands of distinct textual
words drawn from a number of sources representing
various disciplines. The results indicate that the digram
method performs better than the trigram method with
respect to conflation precision and conflation recall ratios.
In either case, the N-gram approach does not appear
to be an efficient means of conflation because of
the peculiarities of the Arabic infix structure,
which reduce the rate of correct N-gram matching.
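The effect of infixes on N-gram matching can be illustrated with a minimal sketch. The abstract does not state which similarity measure the authors used; the Dice coefficient over distinct character N-grams is assumed here because it is a common choice for N-gram conflation, and the function names and example words are hypothetical.

```python
def ngrams(word, n):
    """Return the set of distinct character n-grams in a word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def dice_similarity(a, b, n):
    """Dice coefficient (an assumed measure): 2 * |A & B| / (|A| + |B|)."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


# Illustrative words sharing the root k-t-b (not taken from the article's corpus).
book = "كتاب"      # kitab, "book"
books = "كتب"      # kutub, "books": the plural drops the infixed alif
writing = "كتابة"  # kitaba, "writing": differs from "book" only by a suffix

for other in (books, writing):
    for n in (2, 3):
        print(other, n, round(dice_similarity(book, other, n), 3))
```

In this sketch the suffixed pair keeps most of its N-grams in common, while the pair that differs by an infix shares a single digram and no trigrams, which is consistent with the abstract's observation that infixation lowers the rate of correct matches and that trigrams suffer more than digrams.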
Journal title :
Journal of the American Society for Information Science and Technology