DocumentCode
2541309
Title
A fuzzy c means variant for clustering evolving data streams
Author
Hore, Prodip ; Hall, Lawrence O. ; Goldgof, Dmitry B.
Author_Institution
Univ. of South Florida, Tampa
fYear
2007
fDate
7-10 Oct. 2007
Firstpage
360
Lastpage
365
Abstract
Clustering algorithms for streaming data sets are gaining importance due to the availability of large data streams from different sources. Recently, a number of streaming algorithms have been proposed using crisp algorithms such as hard c means or its variants. The crisp cases may not be easily generalized to fuzzy cases, as these two groups of algorithms try to optimize different objective functions. In this paper, we propose a streaming variant of the fuzzy c means algorithm. At any stage during processing, a good streaming algorithm should be able to summarize the data seen so far and also respond to evolving distributions. We study the tradeoff between summarizing the data seen and responding to an evolving distribution by varying the amount of history used by a streaming algorithm. Empirical evaluation of our algorithm on both artificial and real data sets under a noisy setting shows its effectiveness.
Keywords
fuzzy set theory; pattern clustering; crisp algorithms; data sets; data streams; fuzzy c means variant; hard c means; objective functions; Clustering algorithms; Fuzzy sets; History; Monitoring; Statistical distributions; Telephony
fLanguage
English
Publisher
IEEE
Conference_Titel
Systems, Man and Cybernetics, 2007. ISIC. IEEE International Conference on
Conference_Location
Montreal, Que.
Print_ISBN
978-1-4244-0990-7
Electronic_ISBN
978-1-4244-0991-4
Type
conf
DOI
10.1109/ICSMC.2007.4413710
Filename
4413710