DocumentCode :
2040674
Title :
Fuzzy c-means clustering of partially missing data sets based on statistical representation
Author :
Li, Dan ; Zhong, Chongquan ; Zhang, Liyong
Author_Institution :
Sch. of Electron. & Inf. Eng., Dalian Univ. of Technol., Dalian, China
Volume :
1
fYear :
2010
fDate :
10-12 Aug. 2010
Firstpage :
460
Lastpage :
464
Abstract :
The fuzzy c-means algorithm is a useful technique for clustering real-valued s-dimensional data, but it cannot be applied directly to partially missing data sets. This paper considers the problem of handling missing data in fuzzy clustering and proposes a statistical representation of missing attributes. The approach restricts the statistical analysis of missing attributes to the subsets of the data set that are similar to each incomplete datum, and uses those subsets to impute the missing attributes. This improves both the estimation of missing attributes and the performance of fuzzy clustering on the recovered data. Comparison and analysis of clustering results on the incomplete IRIS data demonstrate that the proposed statistical representation estimates missing attributes reasonably and improves fuzzy c-means clustering of incomplete data.
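The general idea in the abstract (impute each missing attribute from similar data, then run fuzzy c-means on the recovered data) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the statistic or the similarity measure, so the sketch assumes a partial Euclidean distance over shared observed attributes and the mean over the `k` nearest neighbours. The function names `impute_from_neighbors` and `fuzzy_c_means` and the parameters `k`, `m`, `n_iter`, `tol`, and `seed` are hypothetical.

```python
import numpy as np


def impute_from_neighbors(X, k=5):
    """Fill NaN entries with the mean of that attribute over the k most
    similar rows (assumed statistic; the paper's may differ)."""
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    observed = ~np.isnan(X)
    n, s = X.shape
    for i in np.where(~observed.all(axis=1))[0]:
        # Partial distance: use only attributes observed in both rows,
        # rescaled by the fraction of attributes actually compared.
        shared = observed & observed[i]
        counts = shared.sum(axis=1)
        diff = np.where(shared, X - X[i], 0.0)
        dist = np.full(n, np.inf)
        ok = counts > 0
        dist[ok] = (diff[ok] ** 2).sum(axis=1) * s / counts[ok]
        dist[i] = np.inf  # a row is never its own neighbour
        for j in np.where(~observed[i])[0]:
            # Candidates must have attribute j observed and be comparable.
            cand = np.where(observed[:, j] & np.isfinite(dist))[0]
            if cand.size == 0:
                continue  # no usable neighbour: leave the value missing
            nearest = cand[np.argsort(dist[cand])[:k]]
            filled[i, j] = X[nearest, j].mean()
    return filled


def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard fuzzy c-means on complete data; returns memberships U
    (n x c) and cluster prototypes V (c x s)."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        V = (W.T @ X) / W.sum(axis=0)[:, None]  # prototype update
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # avoid division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)  # membership update
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V
```

A typical use would be `U, V = fuzzy_c_means(impute_from_neighbors(X), c=3)` for a three-class data set such as IRIS with some entries set to `NaN`. Averaging over neighbours is only one possible statistic; a median or a nearest-neighbour interval would fit the same structure.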
Keywords :
fuzzy set theory; pattern clustering; statistical analysis; data handling; fuzzy c-means clustering; fuzzy clustering; partially missing data sets; statistical representation; Algorithm design and analysis; Clustering algorithms; Iris; Nearest neighbor searches; Partitioning algorithms; Prototypes; Statistical analysis; clustering; fuzzy c-means; incomplete data; statistical representation;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
Conference_Location :
Yantai, Shandong
Print_ISBN :
978-1-4244-5931-5
Type :
conf
DOI :
10.1109/FSKD.2010.5569767
Filename :
5569767