DocumentCode
972838
Title
A Possibilistic Approach to String Comparison
Author
Bronselaer, Antoon ; De Tre, Guy
Author_Institution
Dept. of Telecommun. & Inf. Process., Ghent Univ., Ghent
Volume
17
Issue
1
fYear
2009
Firstpage
208
Lastpage
223
Abstract
In this paper, comparison of strings is tackled from a possibilistic point of view. Instead of using the concept of similarity between strings, coreference between strings is adopted. The possibility of coreference is estimated by means of a possibilistic comparison operator. In the literature, two important classes of comparison methods for strings have been distinguished: character-based methods and token-based methods. The first class treats a string as a sequence of characters, while the second class treats a string as a vector of substrings. The first contribution of this paper is to propose a new character-based method that is able to detect typographical errors and abbreviations. The main advantage of the proposed technique is its very low complexity in comparison with existing character-based techniques. In a second contribution, two-level systems are investigated and a new approach is described. The novelty of the proposed two-level system is the use of multiset comparison rather than vector comparison. It is shown how an ordered weighted conjunctive operator that uses a parameterized fuzzy quantifier to deliver weights is competitive with frequency-based weights. In addition, the use of a quantifier is significantly faster than the use of existing weight techniques. In a third contribution, a novel class of hybrid techniques is proposed that combines the advantages of several methods. Finally, comparative tests regarding accuracy and execution time are performed and reported.
Keywords
fuzzy logic; fuzzy set theory; mathematical operators; possibility theory; string matching; character-based methods; characters sequence; ordered weighted conjunctive operator; parameterized fuzzy quantifier; possibilistic approach; string comparison; strings coreference; strings similarity; token-based methods; Algorithms; operators (mathematics)
fLanguage
English
Journal_Title
Fuzzy Systems, IEEE Transactions on
Publisher
IEEE
ISSN
1063-6706
Type
jour
DOI
10.1109/TFUZZ.2008.2008025
Filename
4663687