Title :
A clustering based on information granularity for high dimensional sparse data
Author :
Zhao, Yaqin ; Zhou, Xianzhong
Author_Institution :
Dept. of Autom., Nanjing Univ. of Sci. & Technol., China
Abstract :
This paper presents an information granularity-based clustering algorithm that proceeds from smaller granules to larger granules. Initial clustering is performed directly and simply by checking whether two equivalence relations are equal, rather than by computing the intersection of equivalence classes as is usual. The secondary clustering is based on fuzzy granularity: the objects of the fuzzy clustering are not the original data but larger granules (the initial clusters). High-dimensional sparse data are effectively compressed and expressed as sparse feature vectors whose dimension is far lower than that of the original data. As a result, the approach can handle very high-dimensional sparse data efficiently, and its result is independent of the order in which the objects are presented.
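The two-stage idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything below is an assumption made for illustration: objects are taken to be binary rows, an initial granule is taken to be the set of objects with identical nonzero-attribute sets (stored as a compact sparse key), and plain Jaccard similarity with a hypothetical `threshold` stands in for the paper's fuzzy similarity between initial clusters. The function names are hypothetical as well.

```python
from collections import defaultdict


def initial_granules(rows):
    """Stage 1 (assumed form): group objects with identical nonzero-attribute sets.

    Each object is reduced to a frozenset of its nonzero column indices,
    a sparse key that is much smaller than the full row. Grouping is an
    equality test on keys (via hashing); no intersections are computed.
    """
    groups = defaultdict(list)
    for obj_id, row in enumerate(rows):
        key = frozenset(j for j, v in enumerate(row) if v)
        groups[key].append(obj_id)
    return dict(groups)


def jaccard(a, b):
    """Stand-in similarity between two sparse keys (not the paper's measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def merge_granules(granules, threshold=0.5):
    """Stage 2 (assumed form): merge granules whose similarity >= threshold.

    Merging is single-link via union-find, so cluster membership depends
    only on which pairs pass the threshold, not on processing order.
    """
    keys = sorted(granules, key=lambda k: min(granules[k]))
    parent = list(range(len(keys)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(keys)):
        for j in range(i + 1, len(keys)):
            if jaccard(keys[i], keys[j]) >= threshold:
                parent[find(j)] = find(i)

    clusters = defaultdict(list)
    for i, key in enumerate(keys):
        clusters[find(i)].extend(granules[key])
    return sorted(sorted(c) for c in clusters.values())
```

In this sketch the second stage compares only granules, never individual objects, so its cost grows with the number of distinct sparse keys rather than with the number of rows.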
Keywords :
data handling; fuzzy set theory; pattern clustering; rough set theory; vectors; feature vector; fuzzy clustering; fuzzy granularity; high dimensional sparse data; information granularity-based clustering algorithm; Automation; Clustering algorithms; Data mining; Fuzzy set theory; Information analysis; Machine learning; Quantization; Space technology; Sparse matrices; Telephony; Fuzzy granularity; fuzzy similarity between two initial clusters; integrated approximation rate; sparse feature vector;
Conference_Title :
Granular Computing, 2005 IEEE International Conference on
Print_ISBN :
0-7803-9017-2
DOI :
10.1109/GRC.2005.1547305