Eliminating Error Accumulation in Hierarchical Clustering Algorithms

Author

Yanan Jin ; Fei Xiao

Author_Institution

Sch. of Inf. Manage., Hubei Univ. of Econ., Wuhan, China

fYear

2013

fDate

9-11 Sept. 2013

Firstpage

639

Lastpage

642

Abstract

Hierarchical agglomerative clustering treats each data item as a singleton cluster at the outset and then successively merges (or agglomerates) pairs of clusters until all clusters have been merged into a single cluster that contains all the data. However, if two items are merged incorrectly early on, the error accumulates and is amplified by the subsequent iterations, which degrades the final clustering. In this paper, we propose an adaptive hierarchical agglomerative clustering algorithm called the Agglomerative Network Clustering Algorithm (ANCA), adapted from the Newman Rapid Algorithm Based on Heap (NRABH), to eliminate error accumulation in advance. It avoids the errors by re-computing the modularity increment to find the correct nodes to merge. Experiments show that the proposed algorithm avoids error accumulation and yields better results.
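
For orientation, the greedy modularity agglomeration that the abstract builds on can be sketched as follows. This is a minimal illustration of the generic Newman-style scheme (merge the pair of communities with the largest modularity increment, dQ = 2(e_ij - a_i a_j)), not the ANCA procedure itself, whose correction rule the abstract does not spell out. The function name greedy_modularity_merge and the toy graph are assumptions made for this example, and a plain rescan of all pairs stands in for the heap used by NRABH.

```python
def greedy_modularity_merge(edges):
    """Greedy agglomeration on an undirected, unweighted graph without self-loops.

    Hypothetical helper for illustration only. Returns (partition, Q) for the
    level of the merge hierarchy with the highest modularity Q.
    """
    m = len(edges)
    nodes = sorted({u for edge in edges for u in edge})
    comm = {n: {n} for n in nodes}   # community id -> member nodes
    e = {n: {} for n in nodes}       # e[i][j]: fraction of edge ends joining i and j
    a = {n: 0.0 for n in nodes}      # a[i]: fraction of edge ends attached to i
    for u, v in edges:
        w = 1.0 / (2 * m)
        e[u][v] = e[u].get(v, 0.0) + w
        e[v][u] = e[v].get(u, 0.0) + w
        a[u] += w
        a[v] += w

    q = -sum(x * x for x in a.values())  # singletons: e_ii = 0, so Q = -sum(a_i^2)
    best_q = q
    best = [set(c) for c in comm.values()]

    while len(comm) > 1:
        # Re-compute the modularity increment of every connected pair
        # from the current state instead of reusing stale values.
        candidates = [(2 * (e[i][j] - a[i] * a[j]), i, j)
                      for i in e for j in e[i] if i < j]
        if not candidates:
            break  # remaining communities are disconnected
        dq, i, j = max(candidates)

        # Merge community j into community i.
        for k, w in e[j].items():
            if k == i:
                continue
            e[i][k] = e[i].get(k, 0.0) + w
            e[k][i] = e[i][k]
            del e[k][j]
        del e[i][j]
        del e[j]
        a[i] += a.pop(j)
        comm[i] |= comm.pop(j)

        q += dq
        if q > best_q:
            best_q = q
            best = [set(c) for c in comm.values()]

    return sorted(best, key=min), best_q


# Toy graph (an assumption for this example): two triangles joined by one edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
partition, q_best = greedy_modularity_merge(toy_edges)
print(partition, round(q_best, 4))
```

Because each merge is irreversible, a poor early choice in this baseline carries through every later level of the hierarchy, which is the error accumulation the abstract refers to.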

Keywords

merging; pattern clustering; ANCA; NRABH; Newman rapid algorithm based on heap; adaptive hierarchical agglomerative clustering algorithm; agglomerative network clustering algorithm; cluster data merging; error accumulation elimination; singleton cluster; Algorithm design and analysis; Clustering algorithms; Communities; Computers; Merging; Partitioning algorithms; Sparse matrices; Agglomerative; Error avoiding; Hierarchical clustering; Network clustering

fLanguage

English

Publisher

IEEE

Conference_Titel

Emerging Intelligent Data and Web Technologies (EIDWT), 2013 Fourth International Conference on

Conference_Location

Xi'an

Print_ISBN

978-1-4799-2140-9

Type

conf

DOI

10.1109/EIDWT.2013.115

Filename

6631693