• DocumentCode
    1114829
  • Title

    A Family of Similarity Measures Between Two Strings

  • Author

    Findler, Nicholas V.; Leeuwen, Jan van

  • Author_Institution
    Department of Computer Science, State University of New York at Buffalo, Amherst, NY 14226.
  • Issue
    1
  • fYear
    1979
  • Firstpage
    116
  • Lastpage
    118
  • Abstract
    We present a class of similarity measures for quantitatively comparing two strings, that is, two linearly ordered sets of elements. The strings can be of different lengths, the elements come from a single alphabet, and an element may appear any number of times. The limiting values of each measure are 0, when two completely different strings are compared, and 1, when the two strings are identical. Applications of similarity measures are numerous in nonnumerical computations, such as in heuristic search processes in associative networks, in pattern recognition and classification, in game playing programs, and in music and text analysis. We offer a number of feasible measures from among which some are discarded on plausibility grounds. One can select the measure most adequate for one's needs on the basis of a few characteristic examples of strings compared and by considering the specific requirements of the application at hand.
  • Keywords
    Application software; Computer networks; Computer science; Pattern recognition; Solids; Text analysis; Classification problems; search processes; similarity measures between strings; substrings
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type

    jour

  • DOI
    10.1109/TPAMI.1979.4766885
  • Filename
    4766885