Regional Information Center for Science and Technology (RICEST) - Kernels based on weighted Levenshtein distance

DocumentCode :

2329220

Title :

Kernels based on weighted Levenshtein distance

Author :

Xu, Jianhua ; Zhang, Xuegong

Author_Institution :

Sch. of Math. & Comput. Sci., Nanjing Normal Univ., China

fYear :

2004

fDate :

25-29 July 2004

Firstpage :

3015

Abstract :

In some real-world applications, a sample is described as a string of symbols rather than a vector of real numbers. Many training algorithms need to determine the similarity or dissimilarity of two strings. A widely used measure for two strings of different lengths is the weighted Levenshtein distance (WLD), defined as the minimum total weight of the single-symbol insertions, deletions and substitutions required to transform one string into the other. To incorporate prior knowledge of strings into the kernels used in support vector machines and other kernel machines, we use variants of this distance to replace the distance measure in the RBF and exponential kernels and the inner product in the polynomial and sigmoid kernels, forming a new class of string kernels: Levenshtein kernels. Combining the new kernels with a support vector machine, the error rate and variance on the UCI splice site recognition dataset over 20 runs is 5.88 ± 0.53, which is better than the best result, 9.5 ± 0.7, from five other training algorithms.
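As an illustration of the definitions in the abstract, the following is a minimal Python sketch of a weighted Levenshtein distance and of RBF-style and exponential-style kernels built on it. It is not the authors' implementation: the function names, the default unit weights, the `gamma` parameter, and the choice of the plain distance (rather than the paper's unspecified "variants") are all assumptions made here for illustration. Kernels formed this way are not guaranteed to be positive semidefinite in general.

```python
import math


def weighted_levenshtein(a, b, ins_cost=1.0, del_cost=1.0, sub_cost=None):
    """Minimum total weight of insertions, deletions and substitutions
    needed to turn string a into string b (dynamic programming).

    The weights are illustrative assumptions; sub_cost is a function
    (x, y) -> weight and defaults to 0 for a match, 1 for a mismatch.
    """
    if sub_cost is None:
        sub_cost = lambda x, y: 0.0 if x == y else 1.0

    # prev[j] = cost of turning the empty prefix of a into b[:j]
    prev = [j * ins_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        # curr[0] = cost of deleting the first i symbols of a
        curr = [i * del_cost]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + del_cost,              # delete ca
                curr[j - 1] + ins_cost,          # insert cb
                prev[j - 1] + sub_cost(ca, cb),  # substitute ca -> cb
            ))
        prev = curr
    return prev[-1]


def levenshtein_rbf_kernel(a, b, gamma=0.1, **weights):
    """RBF-style kernel with the Euclidean distance replaced by the WLD.
    gamma is a hypothetical width parameter."""
    d = weighted_levenshtein(a, b, **weights)
    return math.exp(-gamma * d * d)


def levenshtein_exp_kernel(a, b, gamma=0.1, **weights):
    """Exponential-style kernel: exp(-gamma * distance)."""
    d = weighted_levenshtein(a, b, **weights)
    return math.exp(-gamma * d)
```

With unit weights the distance reduces to the ordinary edit distance, so identical strings give a kernel value of 1 and the value decays as the strings diverge. A Gram matrix computed this way could be passed to an SVM implementation that accepts precomputed kernels, subject to the positive-semidefiniteness caveat above.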

Keywords :

pattern recognition; radial basis function networks; support vector machines; exponential kernels; kernel machines; polynomial kernels; radial basis function; sigmoid kernels; support vector machine; training algorithms; weighted Levenshtein distance; Application software; DNA; Error analysis; Hidden Markov models; Kernel; Machine learning algorithms; Pattern recognition; Sequences; Support vector machines; Text categorization;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Neural Networks, 2004. Proceedings. 2004 IEEE International Joint Conference on

ISSN :

1098-7576

Print_ISBN :

0-7803-8359-1

Type :

conf

DOI :

10.1109/IJCNN.2004.1381147

Filename :

1381147

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2329220