Regional Information Center for Science and Technology - Hierarchical Clustering Using Homogeneity as Similarity Measure for Big Data Analytics

DocumentCode :

1669767

Title :

Hierarchical Clustering Using Homogeneity as Similarity Measure for Big Data Analytics

Author :

Yunwei Zhao ; Chi-Hung Chi ; Chen Ding ; Raymond Wong ; Wei Zhao ; Can Wang

Author_Institution :

Sch. of Software, Tsinghua Univ., Beijing, China

fYear :

2015

Firstpage :

348

Lastpage :

354

Abstract :

In big data analytics, clustering plays a fundamental and decisive role in supporting pattern mining and value creation. To help improve user experience and satisfaction with clustering algorithms, one important key is to let users define the quality of the aggregated clusters they prefer (e.g., in terms of the homogeneity and the relative population of each resulting cluster) instead of fixing the number of clusters to be obtained before the clustering process. In this paper, we first propose a new measure, called the Clustering Performance Index (or CPI), that takes into consideration homogeneity, relative population, and the number of clusters aggregated. Then we propose a new hierarchical clustering algorithm that adopts homogeneity as its key similarity measure. Experimental results show that our proposed clustering algorithm can achieve a good balance among CPI, the number of clusters aggregated, and the time cost of the algorithm.

Keywords :

Big Data; data analysis; data mining; pattern clustering; CPI; big data analytics; clustering performance index; hierarchical clustering; pattern mining; similarity measure; value creation; Algorithm design and analysis; Australia; Clustering algorithms; Indexes; Performance analysis; Sociology; Statistics; clustering; homogeneity; relative population

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Services Computing (SCC), 2015 IEEE International Conference on

Conference_Location :

New York, NY

Print_ISBN :

978-1-4673-7280-0

Type :

conf

DOI :

10.1109/SCC.2015.55

Filename :

7207373

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1669767