DocumentCode :
623274
Title :
A hybrid approach to improving clustering accuracy using SVM
Author :
Shah, Zawar ; Mahmood, Abdun Naser ; Mustafa, Abdul K.
Author_Institution :
Dept. of Comput. Sci., City Univ. of Sci. & IT, Peshawar, Pakistan
fYear :
2013
fDate :
19-21 June 2013
Firstpage :
783
Lastpage :
788
Abstract :
Support Vector Machines (SVMs) have been used in many areas such as regression, classification and novelty detection due to their accuracy and generalizability. Recently, SVMs have been proposed for clustering analysis as well. Support Vector Clustering (SVC) works by finding the minimum enclosing sphere of data points using SVM training. SVC is a boundary-based clustering method, where the support information is used to construct cluster boundaries. In support vector-based clustering algorithms, the main computational bottleneck is the high cluster labeling time for each data point. In addition, in many cases labeled data is not available for use with SVC. This tends to restrict the scalability of the method and results in decreased efficiency. This also decreases the applicability of the SVC method to real-life datasets, most of which do not have any class labels. In this paper we present a technique that uses SVM to improve the accuracy of clustering without the need for a labeled dataset. We use the K-Means clustering algorithm to generate initial labels from the data, and in the next step we train a Sequential Minimal Optimization (SMO) classifier on these labels. The original data set is then tested using the trained SMO classifier to improve classification accuracy. This process is continued iteratively and stops when further improvement is not possible. The proposed approach is compared against the popular Stephen Winters-Hilt [1] approach and achieves 94% accuracy when applied to benchmark datasets.
Keywords :
pattern classification; pattern clustering; support vector machines; SMO classifier; SVC; SVM training; Stephen winters-Hilt approach; benchmark datasets; boundary based clustering method; cluster boundaries; clustering accuracy; clustering analysis; data points minimum enclosing sphere; high cluster labeling time; hybrid approach; k-means clustering algorithm; sequential minimal optimization classifier; support information; support vector machines; support vector-based clustering algorithms; Accuracy; Clustering algorithms; Kernel; Optimization; Static VAr compensators; Support vector machines; Training; K-Means; Labeling data; SVM;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Industrial Electronics and Applications (ICIEA), 2013 8th IEEE Conference on
Conference_Location :
Melbourne, VIC
Print_ISBN :
978-1-4673-6320-4
Type :
conf
DOI :
10.1109/ICIEA.2013.6566473
Filename :
6566473