DocumentCode :
2472449
Title :
K-means clustering of proportional data using L1 distance
Author :
Kashima, Hisashi ; Hu, Jianying ; Ray, Bonnie ; Singh, Moninder
Author_Institution :
IBM Tokyo Res. Lab., Yamato, Japan
fYear :
2008
fDate :
8-11 Dec. 2008
Firstpage :
1
Lastpage :
4
Abstract :
We present a new L1-distance-based k-means clustering algorithm for clustering high-dimensional proportional vectors. The new algorithm explicitly incorporates proportionality constraints in the computation of the cluster centroids, resulting in reduced L1 error rates. We compare the new method with two competing methods: an approximate L1-distance k-means algorithm, in which the centroid is estimated using cluster means, and a median L1 k-means algorithm, in which the centroid is estimated using cluster medians, with proportionality constraints imposed by normalization in a second step. An application to clustering projects by their distribution of labor hours across skills illustrates the advantages of the new algorithm.
Keywords :
pattern clustering; L1-distance-based k-means clustering algorithm; high-dimensional proportional vectors; median L1 k-means algorithm; proportional data; Closed-form solution; Clustering algorithms; Costs; Distortion measurement; Error analysis; Euclidean distance; Gaussian distribution; Laboratories; Partitioning algorithms; Project management
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Pattern Recognition, 2008. ICPR 2008. 19th International Conference on
Conference_Location :
Tampa, FL
ISSN :
1051-4651
Print_ISBN :
978-1-4244-2174-9
Electronic_ISBN :
1051-4651
Type :
conf
DOI :
10.1109/ICPR.2008.4760982
Filename :
4760982