Title :
HClustream: A Novel Approach for Clustering Evolving Heterogeneous Data Stream
Author :
Yang, Chunyu ; Zhou, Jie
Author_Institution :
Dept. of Autom., Tsinghua Univ., Beijing
Abstract :
Continuously arriving, evolving data streams have become common in many fields, such as sensor networks, Web click streams, and Internet traffic flows. Clustering is one of the most important stream mining tasks and has attracted extensive research from both the machine learning and data mining communities. Many stream clustering methods have been proposed and have proven efficient on specific problems. However, most of them handle only continuous attributes, and few address the heterogeneous clustering problem, in which records mix continuous and discrete attributes. In this paper, we propose a novel approach, based on the CluStream framework, for clustering data streams with heterogeneous features. Each micro cluster is represented by the centroid of its continuous attributes and the histogram of its discrete attributes, and the k-prototype clustering algorithm is used to create both the micro clusters and the macro clusters. Experimental results on both synthetic and real data sets show its efficiency.
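The micro-cluster representation described in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration rather than the authors' implementation: the class name `MicroCluster`, the helper `assign`, the weight `gamma`, and the particular mixed distance (squared Euclidean distance to the centroid plus a histogram-based mismatch term, in the spirit of k-prototype) are all hypothetical.

```python
from collections import Counter


class MicroCluster:
    """Hypothetical summary of one micro cluster over mixed-type records."""

    def __init__(self, n_continuous, n_discrete):
        self.n = 0                                   # number of absorbed records
        self.linear_sum = [0.0] * n_continuous       # per-attribute running sums
        self.histograms = [Counter() for _ in range(n_discrete)]  # value counts

    def add(self, continuous, discrete):
        """Absorb one record by updating the sums and the histograms."""
        self.n += 1
        for i, x in enumerate(continuous):
            self.linear_sum[i] += x
        for hist, value in zip(self.histograms, discrete):
            hist[value] += 1

    def centroid(self):
        """Mean of the continuous attributes over all absorbed records."""
        if self.n == 0:
            raise ValueError("empty micro cluster has no centroid")
        return [s / self.n for s in self.linear_sum]

    def distance(self, continuous, discrete, gamma=1.0):
        """Assumed mixed distance: numeric part plus weighted discrete part.

        The discrete part is 1 minus the relative frequency of the record's
        value in the histogram, so frequent values cost less than rare ones.
        """
        center = self.centroid()
        numeric = sum((x - c) ** 2 for x, c in zip(continuous, center))
        categorical = sum(
            1.0 - hist[value] / self.n
            for hist, value in zip(self.histograms, discrete)
        )
        return numeric + gamma * categorical


def assign(clusters, continuous, discrete, gamma=1.0):
    """Add the record to the nearest non-empty micro cluster and return it."""
    best = min(clusters, key=lambda mc: mc.distance(continuous, discrete, gamma))
    best.add(continuous, discrete)
    return best
```

Because the summary stores only a count, per-attribute sums, and value counts, it can be updated incrementally as each record arrives, which is the property a stream setting needs. How `gamma` should be chosen, and how micro clusters are created, merged, or expired over time, is not stated in the abstract and is left out of this sketch.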
Keywords :
pattern clustering; HClustream; continuous attributes; continuously arriving data stream; discrete attributes; evolving heterogeneous data stream; heterogeneous clustering problems; k-prototype clustering algorithm; macroclusters; microclusters; stream clustering; Algorithm design and analysis; Clustering algorithms; Data mining; Databases; Decision support systems; Distance measurement; IP networks; Machine learning; Sensor phenomena and characterization; Telecommunication traffic;
Conference_Title :
Data Mining Workshops, 2006. ICDM Workshops 2006. Sixth IEEE International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
0-7695-2702-7
DOI :
10.1109/ICDMW.2006.89