Title :
Feature Selection for Cluster Analysis: an Approach Based on the Simplified Silhouette Criterion
Author :
Hruschka, Eduardo R. ; Covões, Thiago F.
Author_Institution :
Catholic Univ. of Santos
Abstract :
This paper explores the problem of selecting relevant features for clustering, assuming that the number of clusters is not known a priori. The number of clusters and the subset of relevant features are usually inter-related. From this standpoint, we propose an exploratory data analysis method that considers the relationship between these two aspects. Empirical results on a number of synthetic and bioinformatics datasets show that the proposed approach can both reduce the number of features and provide good estimates of the number of clusters.
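The abstract names the simplified silhouette criterion without defining it. As a rough sketch only, assuming the common centroid-based formulation (for each object, a is the distance to its own cluster centroid, b is the distance to the nearest other centroid, and the score is (b - a) / max(a, b) averaged over all objects), it could be computed as below. The function name and the toy data are hypothetical and are not taken from the paper.

```python
import math

def simplified_silhouette(points, labels):
    """Mean simplified silhouette of a partition (assumed centroid-based form)."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    # Centroid of each cluster: per-feature mean of its members.
    centroids = {}
    for c in clusters:
        members = [p for p, l in zip(points, labels) if l == c]
        centroids[c] = [sum(col) / len(members) for col in zip(*members)]

    total = 0.0
    for p, l in zip(points, labels):
        a = math.dist(p, centroids[l])  # distance to own centroid
        b = min(math.dist(p, centroids[c]) for c in clusters if c != l)
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(points)

# Toy data (illustrative only): two well-separated groups on the first feature.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = simplified_silhouette(points, [0, 0, 1, 1])
bad = simplified_silhouette(points, [0, 1, 0, 1])
print(good > bad)
```

Under this reading, higher values indicate more compact and better separated clusters, so the same score can be compared across candidate feature subsets and candidate numbers of clusters.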
Keywords :
data analysis; feature extraction; pattern clustering; bioinformatics datasets; cluster analysis; exploratory data analysis method; feature selection; simplified silhouette criterion; Bioinformatics; Clustering algorithms; Clustering methods; Data analysis; Data visualization; Gene expression; Information filtering; Information filters; Supervised learning; Text mining
Conference_Title :
Computational Intelligence for Modelling, Control and Automation, 2005 and International Conference on Intelligent Agents, Web Technologies and Internet Commerce, International Conference on
Conference_Location :
Vienna
Print_ISBN :
0-7695-2504-0
DOI :
10.1109/CIMCA.2005.1631238