DocumentCode :
2129843
Title :
Semi-supervised Collaborative Clustering with Partial Background Knowledge
Author :
Forestier, Germain ; Wemmert, Cédric ; Gancarski, Pierre
Author_Institution :
LSIIT, Univ. of Strasbourg, Illkirch
fYear :
2008
fDate :
15-19 Dec. 2008
Firstpage :
211
Lastpage :
217
Abstract :
In this paper we present a new algorithm for semi-supervised clustering. We assume that a small set of labeled samples is available and use it within a clustering algorithm to discover relevant patterns. We study how our algorithm performs against two other semi-supervised algorithms when the data are multimodal. We then study the case where the user can provide a few samples for some classes but not for every class in the dataset. Indeed, in complex problems, the user is not always able to provide samples for each class present in the dataset. The challenge is consequently to use the set of labeled samples to discover other members of these classes while keeping a degree of freedom to discover unknown clusters for which no samples are available. We address this problem through a series of experiments on synthetic datasets to show the relevance of the proposed method.
Keywords :
data mining; learning (artificial intelligence); pattern classification; pattern clustering; data classification; data mining; labeled sample set; partial background knowledge; semisupervised collaborative clustering; synthetic dataset; Classification algorithms; Clustering algorithms; Collaborative work; Conferences; Data mining; International collaboration; Partitioning algorithms; Semisupervised learning; collaborative clustering; knowledge integration; semi-supervised clustering;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining Workshops, 2008. ICDMW '08. IEEE International Conference on
Conference_Location :
Pisa
Print_ISBN :
978-0-7695-3503-6
Electronic_ISBN :
978-0-7695-3503-6
Type :
conf
DOI :
10.1109/ICDMW.2008.116
Filename :
4733939