Title :
Clustering Inside Classes Improves Performance of Linear Classifiers
Author :
Fradkin, Dmitriy
Author_Institution :
Siemens Corp. Res., Princeton, NJ
Abstract :
This work systematically examines a Clustering Inside Classes (CIC) approach to classification. In CIC, each class is partitioned into subclasses by cluster analysis. We find that CIC, by extracting local structure and producing compact subclasses, can improve the performance of linear classifiers such as SVM and logistic regression. We compare CIC against a global classifier on four benchmark datasets, and empirically analyze how the training set size and the number of clusters per class affect the results of the CIC approach. We also examine the use of an automated method for selecting the number of clusters for each class.
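The CIC idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's implementation: the abstract does not name a clustering algorithm, so plain k-means is used; a multiclass perceptron stands in for the SVM and logistic regression classifiers; training on subclass labels and mapping each predicted subclass back to its parent class is one plausible reading of the approach; and the function names (`kmeans`, `train_perceptron`, `fit_cic`, `predict_cic`) and the toy data are hypothetical.

```python
# Illustrative sketch only. k-means and a perceptron are stand-ins (assumptions)
# for the paper's clustering method and its SVM / logistic regression classifiers.

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))

    def assign():
        return [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]

    for _ in range(iters):
        labels = assign()
        for j in range(k):
            members = [p for p, a in zip(points, labels) if a == j]
            if members:
                centers[j] = tuple(sum(v) / len(members) for v in zip(*members))
    return assign()

def predict(w, x):
    """Return the label whose linear score w[label] . (x, 1) is largest."""
    xa = list(x) + [1.0]
    return max(sorted(w), key=lambda lab: sum(wi * xi for wi, xi in zip(w[lab], xa)))

def train_perceptron(X, y, max_epochs=2000):
    """Multiclass perceptron: one weight vector (with bias) per label."""
    w = {lab: [0.0] * (len(X[0]) + 1) for lab in set(y)}
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in zip(X, y):
            guess = predict(w, x)
            if guess != target:
                xa = list(x) + [1.0]
                w[target] = [wi + xi for wi, xi in zip(w[target], xa)]
                w[guess] = [wi - xi for wi, xi in zip(w[guess], xa)]
                mistakes += 1
        if mistakes == 0:
            break
    return w

def fit_cic(X, y, k):
    """Cluster each class into k subclasses, then train on (class, cluster) labels."""
    sub = [None] * len(X)
    for cls in sorted(set(y)):
        idx = [i for i, c in enumerate(y) if c == cls]
        for i, a in zip(idx, kmeans([X[i] for i in idx], k)):
            sub[i] = (cls, a)
    return train_perceptron(X, sub)

def predict_cic(w, x):
    """Predict a subclass, then map it back to its parent class."""
    return predict(w, x)[0]

def blob(cx, cy):
    """A 3x3 grid of points around (cx, cy); toy data, not from the paper."""
    return [(cx + dx, cy + dy) for dx in (-0.5, 0.0, 0.5) for dy in (-0.5, 0.0, 0.5)]

# XOR-like layout: each class consists of two compact, well-separated groups.
X = blob(0, 0) + blob(4, 4) + blob(0, 4) + blob(4, 0)
y = [0] * 18 + [1] * 18

global_w = train_perceptron(X, y)   # one linear model on the original classes
cic_w = fit_cic(X, y, k=2)          # linear model on the clustered subclasses

global_acc = sum(predict(global_w, x) == c for x, c in zip(X, y)) / len(X)
cic_acc = sum(predict_cic(cic_w, x) == c for x, c in zip(X, y)) / len(X)
print("global training accuracy:", global_acc)
print("CIC training accuracy:", cic_acc)
```

On this toy layout no single hyperplane separates the two classes, while the four subclasses are linearly separable from one another, so the subclass model can fit the training points that the global model cannot. This only illustrates the mechanism on constructed data; it says nothing about the benchmark results reported in the paper.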
Keywords :
pattern classification; pattern clustering; SVM; cluster analysis; clustering inside classes approach; linear classifiers; logistic regression; Artificial intelligence; Clustering algorithms; Data analysis; Educational institutions; Logistics; Partitioning algorithms; Shape; Support vector machine classification; Support vector machines; classification
Conference_Title :
Tools with Artificial Intelligence, 2008. ICTAI '08. 20th IEEE International Conference on
Conference_Location :
Dayton, OH
Print_ISBN :
978-0-7695-3440-4
DOI :
10.1109/ICTAI.2008.29