Regional Information Center for Science and Technology (RICeST) - Chameleon: hierarchical clustering using dynamic modeling

DocumentCode :

1537381

Title :

Chameleon: hierarchical clustering using dynamic modeling

Author :

Karypis, George ; Han, Eui-Hong ; Kumar, Vipin

Author_Institution :

Dept. of Comput. Sci., Minnesota Univ., Minneapolis, MN, USA

Volume :

Issue :

fYear :

1999

fDate :

8/1/1999

Firstpage :

Lastpage :

Abstract :

Clustering is a discovery process in data mining. It groups a set of data in a way that maximizes the similarity within clusters and minimizes the similarity between two different clusters. Many advanced algorithms have difficulty dealing with highly variable clusters that do not follow a preconceived model. By basing its selections on both interconnectivity and closeness, the Chameleon algorithm yields accurate results for these highly variable clusters. Existing algorithms use a static model of the clusters and do not use information about the nature of individual clusters as they are merged. Furthermore, one set of schemes (the CURE algorithm and related schemes) ignores the information about the aggregate interconnectivity of items in two clusters. Another set of schemes (the Rock algorithm, group averaging method, and related schemes) ignores information about the closeness of two clusters as defined by the similarity of the closest items across two clusters. By considering either interconnectivity or closeness only, these algorithms can select and merge the wrong pair of clusters. Chameleon's key feature is that it accounts for both interconnectivity and closeness in identifying the most similar pair of clusters. Chameleon finds the clusters in the data set by using a two-phase algorithm. During the first phase, Chameleon uses a graph partitioning algorithm to cluster the data items into several relatively small subclusters. During the second phase, it uses an algorithm to find the genuine clusters by repeatedly combining these subclusters.
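
The second phase described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`edge_stats`, `merge_score`, `merge_subclusters`), the `alpha` parameter, and the toy graph are hypothetical, and the way interconnectivity and closeness are normalised here is a simplifying assumption, since the abstract does not give the exact formulas. The subclusters that the first phase would produce by graph partitioning are assumed to be given.

```python
from itertools import combinations


def edge_stats(edges, a, b):
    """Return (total weight, count) of edges with one end in a and the other in b."""
    total, count = 0.0, 0
    for (u, v), w in edges.items():
        if (u in a and v in b) or (u in b and v in a):
            total += w
            count += 1
    return total, count


def merge_score(edges, a, b, alpha=1.0):
    """Score a candidate merge using BOTH interconnectivity and closeness.

    Assumption: each quantity is normalised by the average of the same
    quantity measured inside the two clusters (a simplification).
    """
    cross_total, cross_count = edge_stats(edges, a, b)
    if cross_count == 0:
        return 0.0  # unconnected clusters are never merged
    a_total, a_count = edge_stats(edges, a, a)
    b_total, b_count = edge_stats(edges, b, b)

    # Interconnectivity: aggregate weight of the edges joining a and b.
    internal_total = (a_total + b_total) / 2
    ri = cross_total / internal_total if internal_total else cross_total

    # Closeness: average weight of the edges joining a and b.
    mean_a = a_total / a_count if a_count else 0.0
    mean_b = b_total / b_count if b_count else 0.0
    internal_mean = (mean_a + mean_b) / 2
    cross_mean = cross_total / cross_count
    rc = cross_mean / internal_mean if internal_mean else cross_mean

    return ri * rc ** alpha


def merge_subclusters(edges, subclusters, k, alpha=1.0):
    """Repeatedly combine the best-scoring pair until k clusters remain."""
    clusters = [frozenset(c) for c in subclusters]
    while len(clusters) > k:
        best = max(combinations(clusters, 2),
                   key=lambda p: merge_score(edges, p[0], p[1], alpha))
        if merge_score(edges, best[0], best[1], alpha) == 0.0:
            break  # no connected pair is left
        clusters = [c for c in clusters if c not in best] + [best[0] | best[1]]
    return clusters


# Toy similarity graph: two dense groups joined by one weak edge (d-e).
edges = {
    ("a", "b"): 0.9, ("c", "d"): 0.9, ("a", "c"): 0.8, ("b", "d"): 0.8,
    ("e", "f"): 0.9, ("g", "h"): 0.9, ("e", "g"): 0.8, ("f", "h"): 0.8,
    ("d", "e"): 0.1,
}
# Assumed output of the first phase (small subclusters).
subclusters = [{"a", "b"}, {"c", "d"}, {"e", "f"}, {"g", "h"}]
result = merge_subclusters(edges, subclusters, k=2)
```

In this toy graph the weak d-e link scores low on both measures, so the four subclusters combine into the two dense groups rather than across the weak edge.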

Keywords :

data analysis; data mining; graph theory; pattern clustering; CURE algorithm; Chameleon algorithm; Rock algorithm; advanced algorithms; aggregate interconnectivity; closeness; closest items; data item clustering; data set; discovery process; dynamic modeling; graph partitioning algorithm; hierarchical clustering; highly variable clusters; most similar pair; subclusters; two-phase algorithm; Aggregates; Clustering algorithms; Data analysis; Data mining; Earthquakes; Extraterrestrial measurements; Proteins; Seismology; Shape

fLanguage :

English

Journal_Title :

Computer

Publisher :

IEEE

ISSN :

0018-9162

Type :

jour

DOI :

10.1109/2.781637

Filename :

781637

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1537381