DocumentCode :
183014
Title :
Hub selection for hub based clustering algorithms
Author :
Zhenfeng He
Author_Institution :
Coll. of Math. & Comput. Sci., Fuzhou Univ., Fuzhou, China
fYear :
2014
fDate :
19-21 Aug. 2014
Firstpage :
479
Lastpage :
484
Abstract :
Hubs are data instances that appear frequently in nearest-neighbour lists. Because the hubs of a high-dimensional dataset lie close to the centres of clusters or sub-clusters, hub based clustering algorithms select some of them as cluster centres. During hub selection, these algorithms rank data instances by their global hubness scores, which are computed from the nearest-neighbour lists, and ignore cluster-related information such as the instances' labels and the clustering quality of the instances and of their related instances. As a result, some suitable hubs may be neglected. To solve this problem, we suggest evaluating instances by their relative hubness scores. Moreover, we propose a weighted relative hubness score computed from nearest-neighbour lists and silhouette information. We also suggest selecting the instance with the highest silhouette information when two or more instances tie for first place. Experimental results on real and synthetic datasets suggest that both the relative hubness score and the weighted relative hubness score can improve hub based clustering, and that the weighted relative hubness score often performs better.
Keywords :
pattern clustering; set theory; data instance ranking; global hubness scores; high-dimensional dataset; hub based clustering improvement; hub selection; hub-based clustering algorithms; instance selection; nearest neighbour lists; real datasets; silhouette information; subclusters; synthetic datasets; weighted relative hubness score; Algorithm design and analysis; Clustering algorithms; Computational efficiency; Educational institutions; Indexes; Partitioning algorithms; Probabilistic logic; Clustering; High-dimensional data; Hubness; Silhouette Information;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems and Knowledge Discovery (FSKD), 2014 11th International Conference on
Conference_Location :
Xiamen
Print_ISBN :
978-1-4799-5147-5
Type :
conf
DOI :
10.1109/FSKD.2014.6980881
Filename :
6980881