DocumentCode :
477656
Title :
A Weight Entropy k-Means Algorithm for Clustering Dataset with Mixed Numeric and Categorical Data
Author :
Li, Taoying ; Chen, Yan
Author_Institution :
Sch. of Econ. & Manage., Dalian Maritime Univ., Dalian
Volume :
1
fYear :
2008
fDate :
18-20 Oct. 2008
Firstpage :
36
Lastpage :
41
Abstract :
The traditional k-means algorithm makes the distances between objects in the same cluster as small as possible, but it does not adequately account for the distances between objects in different clusters, and it usually fails to cluster datasets with mixed numeric and categorical data correctly. This paper proposes the IWEKM (improved weight entropy k-means) algorithm. The proposed algorithm addresses these problems by modifying the cost function of the entropy weighting k-means clustering algorithm, adding a term that is linearly related to the sum of squared distances between the mean of all objects and the means of all clusters, and a term that is related to the relativity degree of categorical data. The results of different clustering algorithms applied to the Iris data and the Flag data show that the proposed algorithm is efficient.
Keywords :
entropy; pattern clustering; Flag data; IWEKM; Iris data; categorical data; cost function; numeric data; weight entropy k-means algorithm; Clustering algorithms; Conference management; Cost function; Entropy; Fuzzy systems; Iris; Knowledge management; Partitioning algorithms; Utility programs; clustering; k-means algorithm; partition clustering; weight entropy;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems and Knowledge Discovery, 2008. FSKD '08. Fifth International Conference on
Conference_Location :
Shandong
Print_ISBN :
978-0-7695-3305-6
Type :
conf
DOI :
10.1109/FSKD.2008.32
Filename :
4665935