• DocumentCode
    972838
  • Title

    A Possibilistic Approach to String Comparison

  • Author

    Bronselaer, Antoon ; De Tre, Guy

  • Author_Institution
    Dept. of Telecommunications and Information Processing, Ghent University, Ghent
  • Volume
    17
  • Issue
    1
  • fYear
    2009
  • Firstpage
    208
  • Lastpage
    223
  • Abstract
    In this paper, comparison of strings is tackled from a possibilistic point of view. Instead of using the concept of similarity between strings, coreference between strings is adopted. The possibility of coreference is estimated by means of a possibilistic comparison operator. In the literature, two important classes of comparison methods for strings have been distinguished: character-based methods and token-based methods. The first class treats a string as a sequence of characters, while the second class treats a string as a vector of substrings. The first contribution of this paper is a new character-based method that is able to detect typographical errors and abbreviations. The main advantage of the proposed technique is its very low complexity compared with existing character-based techniques. In a second contribution, two-level systems are investigated and a new approach is described. The novelty of the proposed two-level system is the use of multiset comparison rather than vector comparison. It is shown that an ordered weighted conjunctive operator that uses a parameterized fuzzy quantifier to deliver weights is competitive with frequency-based weights. In addition, the use of a quantifier is significantly faster than the use of existing weight techniques. In a third contribution, a novel class of hybrid techniques is proposed that combines the advantages of several methods. Finally, comparative tests regarding accuracy and execution time are performed and reported.
  • Keywords
    fuzzy logic; fuzzy set theory; mathematical operators; possibility theory; string matching; character-based methods; characters sequence; ordered weighted conjunctive operator; parameterized fuzzy quantifier; possibilistic approach; string comparison; strings coreference; strings similarity; token-based methods; Algorithms; operators (mathematics)
  • fLanguage
    English
  • Journal_Title
    Fuzzy Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-6706
  • Type

    jour

  • DOI
    10.1109/TFUZZ.2008.2008025
  • Filename
    4663687