Title :
Unsupervised feature selection based on clustering
Author :
Jiang, ShengYi ; Wang, Lianxi
Author_Institution :
Sch. of Inf., Guangdong Univ. of Foreign Studies, Guangzhou, China
Abstract :
Feature selection plays an important part in improving classification accuracy and clustering quality in many applications. It has been widely studied in supervised learning, but comparatively little in unsupervised learning. This paper presents a novel definition of feature differentiation for identifying the relatively important features and introduces a one-pass clustering-based feature selection approach. The new method, which has nearly linear time complexity, selects the optimal feature subset according to the variation of the feature differentiation. Experimental results on UCI datasets show that, by removing irrelevant or redundant features, the method achieves promising classification and clustering results on most datasets. Compared with traditional feature selection approaches, the proposed algorithm obtains similar or even better performance in terms of dimensionality reduction and classification accuracy.
Keywords :
computational complexity; feature extraction; pattern classification; pattern clustering; unsupervised learning; UCI dataset; clustering; dimensionality reduction; feature differentiation; feature selection; linear time complexity; DNA; Glass; Heart; Iris; Liver; Sonar; Vehicles
Conference_Title :
Bio-Inspired Computing: Theories and Applications (BIC-TA), 2010 IEEE Fifth International Conference on
Conference_Location :
Changsha
Print_ISBN :
978-1-4244-6437-1
DOI :
10.1109/BICTA.2010.5645319