DocumentCode :
813984
Title :
A Normalized Levenshtein Distance Metric
Author :
Yujian, Li ; Bo, Liu
Volume :
29
Issue :
6
fYear :
2007
fDate :
6/1/2007 12:00:00 AM
Firstpage :
1091
Lastpage :
1095
Abstract :
Although a number of normalized edit distances presented so far may offer good performance in some applications, none of them can be regarded as a genuine metric between strings because they do not satisfy the triangle inequality. Given two strings X and Y over a finite alphabet, this paper defines a new normalized edit distance between X and Y as a simple function of their lengths (|X| and |Y|) and the Generalized Levenshtein Distance (GLD) between them. The new distance can be easily computed through GLD with a complexity of O(|X| cdot |Y|) and it is a metric valued in [0, 1] under the condition that the weight function is a metric over the set of elementary edit operations with all costs of insertions/deletions having the same weight. Experiments using the AESA algorithm in handwritten digit recognition show that the new distance can generally provide similar results to some other normalized edit distances and may perform slightly better if the triangle inequality is violated in a particular data set.
Keywords :
Biomedical signal processing; Computational biology; Cost function; Error correction; Handwriting recognition; Image recognition; Information retrieval; Pattern recognition; Sequences; Signal processing algorithms; AESA.; Levenshtein distance; Sequence comparison; metric; normalized edit distance; Algorithms; Artificial Intelligence; Image Enhancement; Image Interpretation, Computer-Assisted; Imaging, Three-Dimensional; Information Storage and Retrieval; Models, Statistical; Pattern Recognition, Automated;
fLanguage :
English
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on
Publisher :
ieee
ISSN :
0162-8828
Type :
jour
DOI :
10.1109/TPAMI.2007.1078
Filename :
4160958
Link To Document :
بازگشت