DocumentCode :
1928689
Title :
A Novel Clustering Algorithm for Prefix-Coded Data Stream Based upon Median-Tree
Author :
Feng, Guangsheng ; Wang, Huiqiang ; Zhao, Qian ; Liang, Ying
Author_Institution :
Coll. of Comput. Sci. & Technol., Harbin Eng. Univ., Harbin
fYear :
2008
fDate :
28-29 Jan. 2008
Firstpage :
79
Lastpage :
84
Abstract :
Prefix-coded data are common in real-world data streams and arise in many applications. Traditional clustering algorithms do not account for the special structure of prefix-coded data, which leads to poor performance and poor clustering results. To address this problem, this paper proposes the concept of a median-tree together with a method for calculating the coding distance. Based on these, a simple algorithm, dfCluster, is put forward that handles prefix-coded data streams efficiently. An in-depth analysis of the algorithm is also presented. Finally, experiments demonstrate that dfCluster clusters such data streams more efficiently than the naive algorithm and that, unlike k-means, its performance is not limited by a prespecified value of k.
Keywords :
pattern clustering; tree data structures; clustering algorithm; median-tree; prefix-coded data stream; prefix-coded data structure; Algorithm design and analysis; Clustering algorithms; Computer science; Data engineering; Data mining; Educational institutions; Internet; Noise shaping; Partitioning algorithms; Statistics; Clustering; Data Stream; Median-tree;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Internet Computing in Science and Engineering, 2008. ICICSE '08. International Conference on
Conference_Location :
Harbin
Print_ISBN :
978-0-7695-3112-0
Electronic_ISBN :
978-0-7695-3112-0
Type :
conf
DOI :
10.1109/ICICSE.2008.103
Filename :
4548238