DocumentCode :
3167434
Title :
Fuzzy granular principal curves algorithm for large data sets
Author :
Hongyun Zhang ; Duoqian Miao ; Witold Pedrycz
Author_Institution :
Dept. of Comput. Sci. & Technol., Tongji Univ., Shanghai, China
fYear :
2013
fDate :
24-28 June 2013
Firstpage :
956
Lastpage :
961
Abstract :
Principal curves, a nonlinear generalization of principal components, are a common tool in multivariate analysis for purposes such as dimensionality reduction and feature extraction. However, existing principal curves algorithms are often inefficient on large data sets owing to their high computational complexity. In this paper, a new method based on the idea of “information granulation and fuzzy sets” is proposed to improve efficiency and noise robustness. First, large amounts of numerical data are granulated into C interval (granular) data using fuzzy C-means clustering and two granulation criteria, which significantly reduces the amount of data to be processed in the later steps. Then, granular principal curves are constructed from the upper and lower bounds of the interval data. Finally, we introduce a quantitative index based on the parameter α to evaluate the fuzziness of the granular principal curves output, where α is a positive parameter that provides some flexibility when optimizing the information granule. A series of numerical studies on synthetic data provides useful insight into the effectiveness of the proposed algorithm.
Keywords :
computational complexity; fuzzy set theory; pattern clustering; principal component analysis; C interval data; computational complexity; dimensionality reduction; feature extraction; fuzzy C-means cluster; fuzzy granular principal curves algorithm; fuzzy sets; information granulation; information granule; large data sets; multivariate analysis; principal components; quantitative index; Algorithm design and analysis; Clustering algorithms; Feature extraction; Indexes; Noise; Partitioning algorithms; Robustness; fuzziness quantization; fuzzy C-mean cluster; granular principal curve; interval data;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
IFSA World Congress and NAFIPS Annual Meeting (IFSA/NAFIPS), 2013 Joint
Conference_Location :
Edmonton, AB
Type :
conf
DOI :
10.1109/IFSA-NAFIPS.2013.6608529
Filename :
6608529