Title :
Feature selection for clustering with constraints using Jensen-Shannon divergence
Author :
Li, Yuanhong ; Dong, Ming ; Ma, Yunqian
Author_Institution :
Dept. of Comput. Sci., Wayne State Univ., Detroit, MI
Abstract :
In semi-supervised clustering, domain knowledge can be converted to constraints and used to guide the clustering. In this paper we propose a feature selection algorithm for semi-supervised clustering. Our method assumes that features are conditionally independent. Feature saliency is first computed in unsupervised clustering with an expectation-maximization model. It is then refined in a tuning step that minimizes a feature-wise constraint violation measure based on the Jensen-Shannon divergence. Experimental results show that a small amount of supervision can improve both clustering and feature selection performance.
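As an illustration of the divergence named in the abstract, the following is a minimal sketch of the Jensen-Shannon divergence between two discrete distributions. The abstract does not say which distributions the paper compares or how the violation measure is assembled, so `constraint_violation` is a hypothetical scoring rule assumed here for illustration only (low divergence expected for must-link pairs, high divergence for cannot-link pairs). It is not the authors' formulation.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: symmetric and bounded in [0, 1]."""
    # Mixture distribution; m_i > 0 wherever p_i > 0 or q_i > 0,
    # so the KL terms below never divide by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def constraint_violation(p, q, must_link):
    """Hypothetical per-feature violation score (an assumption, not from the paper).

    p and q stand for the distributions a single feature induces for the two
    constrained points. A must-link pair is penalized for diverging, a
    cannot-link pair for agreeing.
    """
    d = js_divergence(p, q)
    return d if must_link else 1.0 - d


if __name__ == "__main__":
    same = [0.7, 0.3]
    other = [0.1, 0.9]
    print(constraint_violation(same, same, must_link=True))
    print(constraint_violation(same, other, must_link=True))
```

With base-2 logarithms the divergence is 0 for identical distributions and 1 for distributions with disjoint support, which is what makes the `1.0 - d` complement usable for cannot-link pairs in this sketch.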
Keywords :
constraint theory; expectation-maximisation algorithm; pattern clustering; Jensen-Shannon divergence; domain knowledge; expectation maximization model; feature saliency; feature selection algorithm; feature-wise constraint violation measure; semisupervised clustering; tuning step; unsupervised clustering; Bioinformatics; Clustering algorithms; Computer science; Data mining; Data structures; Drives; Graph theory; Information retrieval; Mathematical model; Multidimensional systems
Conference_Title :
Pattern Recognition, 2008. ICPR 2008. 19th International Conference on
Conference_Location :
Tampa, FL
Print_ISBN :
978-1-4244-2174-9
ISSN :
1051-4651
DOI :
10.1109/ICPR.2008.4761805