Title of article :
Word-Oriented Approximate String Matching Using
Occurrence Heuristic Tables: A Heuristic for Searching
Arabic Text
Author/Authors :
Suleiman H. Mustafa, author
Issue Information :
Monthly, by serial issue number, year 2005
Abstract :
In this article, a word-oriented approximate string matching
approach for searching Arabic text is presented. The
distance between a pair of words is determined on the
basis of aligning the two words by using occurrence
heuristic tables. Two words are considered related if
they have the same morphological or lexical basis. The
heuristic reports an approximate match if common letters
agree in order and noncommon letters represent
valid affixes. The heuristic was tested by using four
different alignment strategies: forward, backward, combined
forward–backward, and combined backward–forward.
Using the error rate and missing rate as
performance indicators, the approach was successful in
providing more than 80% correct matches. Within the
conditions of the experiments performed, the results
indicated that the combined forward–backward strategy
seemed to exhibit the best performance. Most of the
errors were caused by multiple-letter occurrences and
by the presence of weak letters in cases in which the
shared core consisted of one or two letters.
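
The matching rule stated in the abstract (common letters agree in order, noncommon letters are valid affixes) can be illustrated with a minimal Python sketch. This is not the article's occurrence heuristic table procedure, which the abstract does not specify. The greedy alignment, the function names, the `min_core` threshold, the fallback order of the combined strategies, and the affix letter set (the traditional Arabic augmentation letters) are all assumptions made for illustration.

```python
# Assumed affix inventory: the traditional Arabic augmentation letters.
# The article's actual list of valid affixes is not given in the abstract.
AFFIX_LETTERS = set("سألتمونيها")


def align_forward(a, b):
    """Greedily pair letters of `a` with equal letters of `b`, left to right.

    Returns (core, leftovers): the letters matched in order, and the
    unmatched (noncommon) letters from both words.
    """
    core, leftovers = [], []
    used = [False] * len(b)
    j = 0  # first position in b that is still available for matching
    for ch in a:
        k = b.find(ch, j)
        if k == -1:
            leftovers.append(ch)  # letter of a with no partner in b
        else:
            core.append(ch)
            used[k] = True
            j = k + 1
    leftovers.extend(ch for ch, u in zip(b, used) if not u)
    return core, leftovers


def align_backward(a, b):
    """Same greedy pairing, scanning both words from their last letter."""
    core, leftovers = align_forward(a[::-1], b[::-1])
    return core[::-1], leftovers


# Hypothetical reading of the four strategies: a combined strategy tries
# its first direction and falls back to the second if no match is found.
STRATEGIES = {
    "forward": [align_forward],
    "backward": [align_backward],
    "forward-backward": [align_forward, align_backward],
    "backward-forward": [align_backward, align_forward],
}


def is_related(a, b, strategy="forward-backward", min_core=2):
    """Report an approximate match between two words under the assumed rule."""
    for align in STRATEGIES[strategy]:
        core, leftovers = align(a, b)
        if len(core) >= min_core and all(ch in AFFIX_LETTERS for ch in leftovers):
            return True
    return False
```

Under these assumptions, `is_related("كتب", "كاتب")` reports a match because the only noncommon letter is ا, while `is_related("كتب", "كرب")` does not because ر is not in the assumed affix set. A check this simple can still overmatch when the leftover letters of two unrelated words all happen to be affix letters.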
Journal title :
Journal of the American Society for Information Science and Technology