Title :
IPIC separability ratio for semi-supervised feature selection
Author :
Yeung, Daniel S. ; Wang, Jun ; Ng, Wing W. Y.
Author_Institution :
Sch. of Comput. Sci. & Eng., South China Univ. of Technol., Guangzhou, China
Abstract :
Owing to the growth of computer processing power, datasets for pattern classification problems are becoming larger. In semi-supervised learning problems, only a portion of the training samples are labeled, which further reduces the effort of data collection. Consequently, the number of features grows easily and the curse of dimensionality becomes a serious problem in semi-supervised pattern classification. Classical class-separability-based feature selection methods assume that samples in the same class belong to a single cluster. However, this assumption does not hold in many real-life pattern classification problems; instead, samples in a class may form several clusters (prototypes) [23]. These observations motivate us to propose the intra-prototype/inter-class separability ratio (IPICSR). Features selected by the IPICSR tend to preserve locality within a prototype while maximizing inter-class differences. Experimental results show that the average testing accuracy of the proposed method outperforms that of off-the-shelf methods.
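The abstract defines the criterion only at a high level: reward inter-class difference while keeping the scatter within each prototype (cluster inside a class) small. A minimal sketch of such a ratio is given below, assuming prototype membership has already been obtained by clustering each class; the function name, the exact scatter terms, and the simple per-feature ratio are illustrative assumptions, not the paper's actual IPICSR formula.

```python
import numpy as np

def ipic_separability_scores(X, class_labels, prototype_labels):
    """Illustrative per-feature score: between-class scatter divided by
    within-prototype scatter (a hypothetical simplification of IPICSR).

    X                : (n_samples, n_features) data matrix
    class_labels     : class of each sample
    prototype_labels : prototype (cluster) of each sample; a class may
                       contain several prototypes
    """
    X = np.asarray(X, dtype=float)
    overall_mean = X.mean(axis=0)

    # Between-class scatter per feature: class means far from the overall
    # mean increase the score.
    between = np.zeros(X.shape[1])
    for c in np.unique(class_labels):
        Xc = X[class_labels == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2

    # Within-prototype scatter per feature: taken per prototype rather than
    # per class, so a class spread over several clusters is not penalized.
    within = np.zeros(X.shape[1])
    for p in np.unique(prototype_labels):
        Xp = X[prototype_labels == p]
        within += ((Xp - Xp.mean(axis=0)) ** 2).sum(axis=0)

    # Larger ratio = feature separates classes while preserving locality
    # within each prototype.
    return between / (within + 1e-12)
```

Feature selection would then keep the top-k features by this score; the key design point, matching the abstract, is that intra-class scatter is measured per prototype, so multi-cluster classes are not treated as a single blob.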
Keywords :
learning (artificial intelligence); pattern classification; IPIC separability ratio; class separability; data collection; interclass difference; intraprototype-interclass separability ratio; pattern classification problem; semisupervised feature selection; semisupervised learning problem; Clustering algorithms; Computer science; Cybernetics; Filters; Internet; Laplace equations; Machine learning; Pattern classification; Prototypes; Semisupervised learning; Semi-supervised Feature Selection; Separability Ratio;
Conference_Titel :
Machine Learning and Cybernetics, 2009 International Conference on
Conference_Location :
Baoding
Print_ISBN :
978-1-4244-3702-3
Electronic_ISBN :
978-1-4244-3703-0
DOI :
10.1109/ICMLC.2009.5212468