• DocumentCode
    2923694
  • Title

    A practical comparison of edit distance approximation algorithms

  • Author

    Hanada, Hiroyuki ; Nakamura, Atsuyoshi ; Kudo, Mineichi

  • Author_Institution
    Grad. Sch. of Inf. Sci. & Technol., Hokkaido Univ., Sapporo, Japan
  • fYear
    2011
  • fDate
    8-10 Nov. 2011
  • Firstpage
    231
  • Lastpage
    236
  • Abstract
    The edit distance is a basic string similarity measure for many applications such as string searching, text mining, signal processing, bioinformatics and so on. However, its high computational cost often prevents it from being used for a large set of strings like similar string searching. A promising solution for the problem is to approximate the edit distance with low computational cost. However, although there are many methods for approximating the edit distance, most of them are analyzed only theoretically. In fact, most of the methods can evaluate the edit distance only in terms of order notations, and do not conduct any experiment. This is a large obstacle for applying them to real applications. In this study we will first list up existing edit distance approximation methods. Then we compare them by: (i) approximation performances shown by the original authors, (ii) approximation performances re-analyzed by us (concrete values instead of the order notations) and (iii) experimental performances by our implementations.
  • Keywords
    approximation theory; data mining; string matching; text analysis; approximation performance; bioinformatics; edit distance approximation algorithm; high computational cost; low computational cost; order notation; signal processing; similarity measure; string searching; text mining; Approximation algorithms; Bioinformatics; Computational efficiency; Concrete; Function approximation; Vectors; approximate string matching; edit distance; string replacement; string searching; string vectorization;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Granular Computing (GrC), 2011 IEEE International Conference on
  • Conference_Location
    Kaohsiung
  • Print_ISBN
    978-1-4577-0372-0
  • Type

    conf

  • DOI
    10.1109/GRC.2011.6122599
  • Filename
    6122599