Title :
Weighted kernel functions for SVM learning in string domains: a distance function viewpoint
Author :
Vanschoenwinkel ; Liu, Feng ; Manderick, Bernard
Author_Institution :
Dept. of Informatics, Vrije Univ. Brussel, Belgium
Abstract :
This paper extends the idea of weighted distance functions to kernels and support vector machines. We focus on applications that rely on sliding a window over a sequence of string data. For this type of problem, it is argued that a symbolic, context-based representation of the data should be preferred over a continuous, real-valued format, as it is a much more intuitive setting for working with (weighted) distance functions. It is shown how a weighted string distance can be decomposed and subsequently used in different kernel functions, and how these kernel functions correspond to inner products between real vectors. As a case study, named entity recognition is used, with information gain ratio as the weighting scheme.
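The decomposition mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' exact formulation: it assumes a weighted overlap (Hamming-style) distance over fixed-length context windows, and all function names, the toy alphabet, and the weight values are hypothetical. In the paper the weights come from information gain ratio; here they are placeholder numbers. The sketch shows that the weighted count of matching positions equals an inner product between real vectors, obtained by one-hot encoding each position and scaling it by the square root of its weight.

```python
import math

def weighted_overlap_distance(x, y, weights):
    """Sum of the position weights where the two windows disagree."""
    return sum(w for a, b, w in zip(x, y, weights) if a != b)

def weighted_overlap_kernel(x, y, weights):
    """Sum of the position weights where the two windows agree."""
    return sum(w for a, b, w in zip(x, y, weights) if a == b)

def embed(x, weights, alphabet):
    """Map a symbolic window to a real vector: one one-hot block per
    position, scaled by sqrt(weight) of that position."""
    vec = []
    for symbol, w in zip(x, weights):
        vec.extend(math.sqrt(w) if symbol == s else 0.0 for s in alphabet)
    return vec

def dot(u, v):
    """Plain inner product between two real vectors."""
    return sum(a * b for a, b in zip(u, v))

def weighted_rbf_kernel(x, y, weights, gamma=1.0):
    """RBF-style kernel built on the weighted distance. Because the
    distance is half the squared Euclidean distance between the embedded
    vectors, this is a Gaussian kernel in the embedded space."""
    return math.exp(-gamma * weighted_overlap_distance(x, y, weights))

# Toy data: windows of three tokens; the weights are placeholders,
# not information gain ratio values from the paper.
alphabet = ["at", "in", "Brussels", "Ghent"]
weights = [0.25, 1.0, 0.25]
x = ("at", "Brussels", "in")
y = ("in", "Brussels", "in")

print(weighted_overlap_kernel(x, y, weights))
print(dot(embed(x, weights, alphabet), embed(y, weights, alphabet)))
print(weighted_overlap_distance(x, y, weights))
print(weighted_rbf_kernel(x, y, weights))
```

In this sketch the kernel value and the distance always sum to the total weight, which is one simple way a weighted distance can be turned into a similarity that an SVM can use directly on the symbolic windows.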
Keywords :
data structures; learning (artificial intelligence); sequences; string matching; support vector machines; SVM learning; data representation; information gain ratio; named entity recognition; string data sequence; string domain; symbolic context-based representation; weighted distance function; weighted kernel functions; Computational modeling; Cybernetics; Discrete transforms; Informatics; Kernel; Machine learning; Position measurement; Space technology; Support vector machines; Information Gain Ratio; Kernel Functions; Metrics; Named Entity Recognition; Support Vector Machines;
Conference_Title :
Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on
Conference_Location :
Guangzhou, China
Print_ISBN :
0-7803-9091-1
DOI :
10.1109/ICMLC.2005.1527679