DocumentCode :
3031364
Title :
Creating Streaming Iterative Soft Clustering Algorithms
Author :
Hore, Prodip ; Hall, Lawrence O. ; Goldgof, Dmitry B.
Author_Institution :
Univ. of South Florida, Tampa
fYear :
2007
fDate :
24-27 June 2007
Firstpage :
484
Lastpage :
488
Abstract :
There is an increasing number of large labeled and unlabeled data sets available. Clustering algorithms are best suited for helping one make sense of unlabeled data. However, scaling iterative clustering algorithms to large amounts of data has been a challenge. The computation time can be very long, and for data sets that will not fit in even the largest memory, only carefully chosen subsets of the data can practically be clustered. We present a general approach that enables iterative fuzzy/possibilistic clustering algorithms to be turned into algorithms that can handle arbitrary amounts of streaming data. The computation time is also reduced for very large data sets, while the clustering results will be very similar to those from clustering with all the data, if that were possible. We introduce transformed equations for fuzzy C-means, possibilistic C-means, and the Gustafson-Kessel algorithm, and show excellent performance with a streaming fuzzy C-means implementation. The resulting clusters are both sensible and, for comparable data sets (those that fit in memory), almost identical to those obtained with the original clustering algorithm.
Keywords :
fuzzy logic; iterative methods; pattern clustering; possibility theory; Gustafson-Kessel algorithm; fuzzy-C-means; iterative fuzzy-possibilistic clustering algorithms; possibilistic C-means; streaming iterative soft clustering algorithm; unlabeled data set; Clustering algorithms; Computer science; Equations; Fuzzy sets; Iterative algorithms; Iterative methods; Labeling; Partitioning algorithms; Sampling methods; Wrapping; clustering; fuzzy; possibilistic; scalable; streaming;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Information Processing Society, 2007. NAFIPS '07. Annual Meeting of the North American
Conference_Location :
San Diego, CA
Print_ISBN :
1-4244-1213-7
Electronic_ISBN :
1-4244-1214-5
Type :
conf
DOI :
10.1109/NAFIPS.2007.383888
Filename :
4271111