Title :
A new approach for semi-supervised clustering based on Fuzzy C-Means
Author :
Macario, Valmir ; de Carvalho, F.A.T.
Author_Institution :
Center of Inf., Fed. Univ. of Pernambuco, Recife, Brazil
Abstract :
In traditional machine learning applications, only labeled data are used to train the classifier. In many real applications, labeled data are difficult, expensive, and time-consuming to obtain, and they require human experts. Semi-supervised learning addresses this issue by combining a large amount of unlabeled data with the labeled data to build better classifiers. A semi-supervised algorithm can be built as an extension of an unsupervised clustering algorithm by adding a term to its objective function that uses the labeled information to guide the learning process. This study presents a new algorithm for semi-supervised clustering based on the Fuzzy C-Means algorithm. The resulting classifier was evaluated and compared against two semi-supervised clustering algorithms in the context of learning from partially labeled data. The behavior of the proposed algorithm is discussed, and the results are validated using cross-validation and confidence intervals. The results indicate that the new algorithm achieves better accuracy when only a few labeled data are available.
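The idea of extending an unsupervised objective with a term that uses labels can be illustrated with a minimal sketch. The code below is not the algorithm proposed in this paper, whose objective function is not given in the abstract. It assumes a generic quadratic supervision penalty with fuzzifier m = 2, in the style of Pedrycz and Waletzky's partially supervised Fuzzy C-Means, and it assumes one cluster per class. The function name `semi_supervised_fcm`, the parameter `alpha`, and the convention that `-1` marks an unlabeled point are all hypothetical choices made for illustration.

```python
import numpy as np


def semi_supervised_fcm(X, labels, n_clusters, alpha=1.0, n_iter=100, tol=1e-6):
    """Illustrative semi-supervised FCM sketch (m = 2), not the paper's method.

    Minimizes  sum_ik u_ik^2 d_ik^2 + alpha * sum_ik (u_ik - f_ik)^2 d_ik^2,
    where f_ik is 1 if point k is labeled with class i and 0 otherwise.
    labels: integer array, -1 for unlabeled, else a class in [0, n_clusters).
    Assumes every cluster has at least one labeled point.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    labeled = labels >= 0

    # F holds the one-hot label of each labeled point; unlabeled rows stay zero.
    F = np.zeros((X.shape[0], n_clusters))
    idx = np.flatnonzero(labeled)
    F[idx, labels[idx]] = 1.0

    # Initialize each prototype as the mean of the labeled points of its class,
    # so that cluster i corresponds to class i.
    V = (F.T @ X) / F.sum(axis=0)[:, None]

    for _ in range(n_iter):
        # Squared Euclidean distances between points and prototypes.
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
        inv = 1.0 / np.maximum(d2, 1e-12)
        fcm = inv / inv.sum(axis=1, keepdims=True)  # standard FCM memberships

        # Membership update: unlabeled points keep the FCM memberships,
        # labeled points are pulled toward their label with strength alpha.
        scale = np.where(labeled, 1.0, 1.0 + alpha)[:, None]
        U = (scale * fcm + alpha * F) / (1.0 + alpha)

        # Prototype update: weighted mean using both objective terms.
        W = U ** 2 + alpha * (U - F) ** 2
        V_new = (W.T @ X) / W.sum(axis=0)[:, None]

        shift = np.abs(V_new - V).max()
        V = V_new
        if shift < tol:
            break

    # U is the membership matrix from the last iteration performed.
    return U, V
```

With `alpha = 0` the penalty term vanishes and the updates reduce to standard Fuzzy C-Means started from the labeled class means; larger values of `alpha` give the labeled points more influence on the prototypes.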
Keywords :
fuzzy set theory; pattern classification; pattern clustering; unsupervised learning; classifier; fuzzy C-means clustering; labeled data; machine learning; semisupervised clustering; unlabeled data; unsupervised clustering algorithm; Accuracy; Algorithm design and analysis; Clustering algorithms; Mathematical model; Partitioning algorithms; Prototypes; Training;
Conference_Title :
Fuzzy Systems (FUZZ), 2010 IEEE International Conference on
Conference_Location :
Barcelona
Print_ISBN :
978-1-4244-6919-2
DOI :
10.1109/FUZZY.2010.5584306