DocumentCode :
1111903
Title :
Weighted Mahalanobis Distance Kernels for Support Vector Machines
Author :
Wang, Defeng ; Yeung, Daniel S. ; Tsang, Eric C C
Author_Institution :
Chinese Univ. of Hong Kong, Shatin
Volume :
18
Issue :
5
fYear :
2007
Firstpage :
1453
Lastpage :
1462
Abstract :
The support vector machine (SVM) has been demonstrated to be a very effective classifier in many applications, but its performance is still limited as the data distribution information is underutilized in determining the decision hyperplane. Most of the existing kernels employed in nonlinear SVMs measure the similarity between a pair of pattern images based on the Euclidean inner product or the Euclidean distance of corresponding input patterns, which ignores data distribution tendency and makes the SVM essentially a "local" classifier. In this paper, we provide a step toward a paradigm of kernels by incorporating data specific knowledge into existing kernels. We first find the data structure for each class adaptively in the input space via agglomerative hierarchical clustering (AHC), and then construct the weighted Mahalanobis distance (WMD) kernels using the detected data distribution information. In WMD kernels, the similarity between two pattern images is determined not only by the Mahalanobis distance (MD) between their corresponding input patterns but also by the sizes of the clusters they reside in. Although WMD kernels are not guaranteed to be positive definite (pd) or conditionally positive definite (cpd), satisfactory classification results can still be achieved because regularizers in SVMs with WMD kernels are empirically positive in pseudo-Euclidean (pE) spaces. Experimental results on both synthetic and real-world data sets show the effectiveness of "plugging" data structure into existing kernels.
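The abstract does not give the WMD kernel formula, so the following is only a minimal sketch of the general idea: a Gaussian-style kernel whose distance is a Mahalanobis distance built from per-cluster covariances and scaled by relative cluster size. The function names, the `gamma` and `ridge` parameters, and the particular weighting scheme are assumptions made for illustration and are not taken from the paper; cluster labels are assumed to be given (for example by a prior AHC step) rather than computed here.

```python
import numpy as np

def cluster_stats(X, labels, ridge=1e-6):
    """Return {cluster: (inverse covariance, relative size)}.

    Assumes every cluster has at least two points. A small ridge term keeps
    the covariance invertible.
    """
    labels = np.asarray(labels)
    stats = {}
    for c in np.unique(labels):
        pts = X[labels == c]
        cov = np.atleast_2d(np.cov(pts, rowvar=False)) + ridge * np.eye(X.shape[1])
        stats[int(c)] = (np.linalg.inv(cov), len(pts) / len(X))
    return stats

def mahalanobis_sq(x, y, cov_inv):
    """Squared Mahalanobis distance (x - y)^T cov_inv (x - y)."""
    d = x - y
    return float(d @ cov_inv @ d)

def illustrative_kernel(x, cx, y, cy, stats, gamma=0.5):
    """Hypothetical size-weighted Mahalanobis RBF (NOT the paper's formula).

    cx and cy are the cluster labels of x and y. The two size-weighted
    inverse covariances are averaged so that k(x, y) == k(y, x).
    """
    inv_x, w_x = stats[cx]
    inv_y, w_y = stats[cy]
    metric = 0.5 * (w_x * inv_x + w_y * inv_y)
    return float(np.exp(-gamma * mahalanobis_sq(x, y, metric)))

# Toy data: two well-separated clusters of four points each.
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1],
              [5, 5], [6, 5], [5, 6], [6, 6]], dtype=float)
labels = [0, 0, 0, 0, 1, 1, 1, 1]
stats = cluster_stats(X, labels)

k_same = illustrative_kernel(X[0], 0, X[0], 0, stats)   # a point with itself
k_near = illustrative_kernel(X[0], 0, X[1], 0, stats)   # same cluster
k_far = illustrative_kernel(X[0], 0, X[4], 1, stats)    # different clusters
```

As the abstract notes, kernels of this kind are not guaranteed to be positive definite, so a Gram matrix built this way should not be assumed to satisfy Mercer's condition.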
Keywords :
geometry; pattern clustering; support vector machines; Euclidean distance; Euclidean inner product; agglomerative hierarchical clustering; data distribution information; data structure; pattern images; weighted Mahalanobis distance kernels; Bioinformatics; Data structures; Image processing; Intrusion detection; Kernel; Pattern recognition; Statistical learning; Support vector machine classification; Indefinite kernels; support vector machines (SVMs); Algorithms; Artificial Intelligence; Computer Simulation; Models, Statistical; Pattern Recognition, Automated; Reproducibility of Results; Sensitivity and Specificity
fLanguage :
English
Journal_Title :
Neural Networks, IEEE Transactions on
Publisher :
IEEE
ISSN :
1045-9227
Type :
jour
DOI :
10.1109/TNN.2007.895909
Filename :
4298136