• DocumentCode
    3334550
  • Title

    A contextual normalised edit distance

  • Author

    de la Higuer, Colin ; Micó, Luisa

  • Author_Institution
    Lab. Hubert Curien, Univ. de St.-Etienne, St. Etienne
  • fYear
    2008
  • fDate
    7-12 April 2008
  • Firstpage
    354
  • Lastpage
    361
  • Abstract
    In order to better fit a variety of pattern recognition problems over strings, using a normalised version of the edit or Levenshtein distance is considered to be an appropriate approach. The goal of normalisation is to take into account the lengths of the strings. We define a new normalisation, contextual, where each edit operation is divided by the length of the string on which the edit operation takes place. We prove that this contextual edit distance is a metric and that it can be computed through an extension of the usual dynamic programming algorithm for the edit distance. We also provide a fast heuristic which nearly always returns the same result and we show over several experiments that the distance obtains good results in classification tasks and has a low intrinsic dimension in comparison with other normalised edit distances.
  • Keywords
    dynamic programming; pattern classification; string matching; Levenshtein distance; classification task; contextual normalised edit distance; dynamic programming algorithm; pattern recognition problem; Acceleration; Computational biology; Dynamic programming; Heuristic algorithms; Mathematics; Measurement standards; Pattern recognition; Space exploration; Topology; Tree graphs;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Data Engineering Workshop, 2008. ICDEW 2008. IEEE 24th International Conference on
  • Conference_Location
    Cancun
  • Print_ISBN
    978-1-4244-2161-9
  • Electronic_ISBN
    978-1-4244-2162-6
  • Type

    conf

  • DOI
    10.1109/ICDEW.2008.4498345
  • Filename
    4498345