Title of article :
Vicinal support vector classifier using supervised kernel-based clustering
Author/Authors :
Yang, Xulei and Cao, Aize and Song, Qing and Schaefer, Gerald and Su, Yi
Issue Information :
Journal with serial number, year 2014
Abstract :
Objective
Support vector machines (SVMs) have drawn considerable attention due to their high generalisation ability and superior classification performance compared to other pattern recognition algorithms. However, the assumption that the learning data is identically generated from unknown probability distributions may limit the application of SVMs to real problems. In this paper, we propose a vicinal support vector classifier (VSVC) which is shown to be able to effectively handle practical applications where the learning data may originate from different probability distributions.
Methods
The proposed VSVC method utilises a set of new vicinal kernel functions which are constructed based on supervised clustering in the kernel-induced feature space. Our proposed approach comprises two steps. In the clustering step, a supervised kernel-based deterministic annealing (SKDA) clustering algorithm is employed to partition the training data into different soft vicinal areas of the feature space in order to construct the vicinal kernel functions. In the training step, the SVM technique is used to minimise the vicinal risk function under the constraints of the vicinal areas defined in the SKDA clustering step.
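The idea of soft, label-aware cluster assignment in the clustering step can be illustrated with a minimal sketch. This is not the authors' SKDA algorithm, whose details the abstract does not give. It only shows the Gibbs-style soft assignment commonly used in deterministic annealing, under several assumptions made for illustration: an RBF kernel, cluster prototypes held as input-space points, a fixed temperature, and the hypothetical names `rbf` and `soft_memberships`.

```python
import math

def rbf(x, y, gamma=1.0):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2) (assumed kernel)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def soft_memberships(x, label, prototypes, proto_labels,
                     temperature=0.5, gamma=1.0):
    """Soft assignment of x to the prototypes that share its class label.

    For an RBF kernel the squared feature-space distance between x and a
    prototype c is 2 - 2*k(x, c). Prototypes carrying a different label
    receive membership 0, which is the supervised part of this sketch.
    """
    weights = []
    for c, c_label in zip(prototypes, proto_labels):
        if c_label != label:
            weights.append(0.0)
        else:
            dist2 = 2.0 - 2.0 * rbf(x, c, gamma)
            weights.append(math.exp(-dist2 / temperature))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no prototype shares this label")
    return [w / total for w in weights]

# Illustrative data only: two prototypes for class 0, one for class 1.
prototypes = [(0.0, 0.0), (2.0, 2.0), (5.0, 5.0)]
proto_labels = [0, 0, 1]
m = soft_memberships((0.1, 0.1), 0, prototypes, proto_labels)
```

Lowering `temperature` makes the memberships approach a hard assignment to the nearest same-class prototype, while raising it spreads them more evenly; an annealing schedule would lower it gradually.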
Results
Experimental results on both artificial and real medical datasets show that our proposed VSVC achieves better classification accuracy and lower computational time compared to a standard SVM. For an artificial dataset constructed from non-separated data, the classification accuracy of VSVC is between 95.5% and 96.25% (using different cluster numbers), which compares favourably to the 94.5% achieved by SVM. The VSVC training time is between 8.75 s and 17.83 s (for 2–8 clusters), considerably less than the 65.0 s required by SVM. On a real mammography dataset, the best classification accuracy of VSVC is 85.7%, clearly outperforming a standard SVM, which obtains an accuracy of only 82.1%. A similar performance improvement is confirmed on two further real datasets, a breast cancer dataset (74.01% vs. 72.52%) and a heart dataset (84.77% vs. 83.81%), coupled with a reduction in learning time (32.07 s vs. 92.08 s and 25.00 s vs. 53.31 s, respectively). Furthermore, the VSVC results in the number of support vectors being equal to the specified cluster number, and hence in a much sparser solution compared to a standard SVM.
Conclusion
Incorporating a supervised clustering algorithm into the SVM technique leads to a sparse but effective solution, while making the proposed VSVC adaptive to different probability distributions of the training data.
Keywords :
Support Vector Machines , Kernel-based data clustering , Supervised deterministic annealing , Mammographic mass classification , Biomedical data classification
Journal title :
Artificial Intelligence In Medicine