Title :
Scaling up discrete distribution clustering using ADMM
Author :
Jianbo Ye ; Jia Li
Author_Institution :
Pennsylvania State Univ., University Park, PA, USA
Abstract :
The discrete distribution as a sparse representation, equipped with the Kantorovich-Wasserstein metric, has been proven effective in learning tasks on imagery data. However, clustering based on the Kantorovich metric under a principled optimization criterion is computationally challenging, and has not been adequately explored. In this paper, we focus on the scalability issue and develop a new algorithm for clustering distributions. An optimal centroid or representative distribution in the sense of the Kantorovich metric is solved for each cluster. The key idea is to adapt the state-of-the-art distributed optimization approach called alternating direction method of multipliers (ADMM). The new algorithm achieves linear complexity in the update of each centroid and can be easily parallelizable, improving significantly over the existing method. It is also observed that in practice, satisfactory results can be obtained after a few tens of iterations. We conduct experiments on both synthetic and real data to demonstrate the computational efficiency and accuracy of the new algorithm.
Keywords :
computational complexity; image representation; learning (artificial intelligence); optimisation; pattern clustering; ADMM; Kantorovich-Wasserstein metric; alternating direction method of multipliers; discrete distribution clustering; distributed optimization approach; imagery data; learning tasks; linear complexity; principled optimization criterion; sparse representation; Clustering algorithms; Complexity theory; Computer vision; Earth; Linear programming; Measurement; Optimization; ADMM; Clustering; Discrete Distribution; Large-Scale Learning;
Conference_Titel :
Image Processing (ICIP), 2014 IEEE International Conference on
Conference_Location :
Paris
DOI :
10.1109/ICIP.2014.7026066