• DocumentCode
    2541309
  • Title
    A fuzzy c means variant for clustering evolving data streams
  • Author
    Hore, Prodip ; Hall, Lawrence O. ; Goldgof, Dmitry B.
  • Author_Institution
    Univ. of South Florida, Tampa
  • fYear
    2007
  • fDate
    7-10 Oct. 2007
  • Firstpage
    360
  • Lastpage
    365
  • Abstract
    Clustering algorithms for streaming data sets are gaining importance due to the availability of large data streams from different sources. Recently, a number of streaming algorithms have been proposed that build on crisp algorithms such as hard c means or its variants. The crisp cases may not generalize easily to fuzzy cases, as the two groups of algorithms optimize different objective functions. In this paper, we propose a streaming variant of the fuzzy c means algorithm. At any stage during processing, a good streaming algorithm should be able to summarize the data seen so far and also respond to evolving distributions. We study the tradeoff between summarizing the data seen and responding to an evolving distribution by varying the amount of history used by the streaming algorithm. Empirical evaluation on both artificial and real data sets under a noisy setting shows the algorithm's effectiveness.
  • Keywords
    fuzzy set theory; pattern clustering; crisp algorithms; data sets; data streams; fuzzy c means variant; hard c means; objective functions; Clustering algorithms; Fuzzy sets; History; Monitoring; Statistical distributions; Telephony
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man and Cybernetics, 2007. ISIC. IEEE International Conference on
  • Conference_Location
    Montreal, Que.
  • Print_ISBN
    978-1-4244-0990-7
  • Electronic_ISBN
    978-1-4244-0991-4
  • Type
    conf
  • DOI
    10.1109/ICSMC.2007.4413710
  • Filename
    4413710
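
The abstract describes summarizing the data seen so far while varying how much history the streaming algorithm uses. The record does not give the algorithm itself, so the following is only a minimal sketch of one plausible reading, not the authors' method: each chunk is clustered with a weighted fuzzy c means, and the centroids of the last `history` chunks are carried forward as weighted points. The names `stream_fcm`, `weighted_fcm`, `memberships`, and the parameters `history`, `m`, `iters`, and `seed` are all hypothetical, as is the choice to weight each carried centroid by the summed memberships of its own chunk's raw points.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def memberships(points, centers, m):
    """Standard fuzzy c means memberships; each row sums to 1."""
    exp = 1.0 / (m - 1.0)
    rows = []
    for p in points:
        # Clamp the distance so a point sitting on a center does not divide by zero.
        inv = [(1.0 / max(sq_dist(p, v), 1e-12)) ** exp for v in centers]
        total = sum(inv)
        rows.append([x / total for x in inv])
    return rows

def weighted_fcm(points, weights, c, m=2.0, iters=100, seed=0):
    """Fuzzy c means in which point i counts weights[i] times in the center update."""
    rng = random.Random(seed)
    centers = rng.sample(points, c)  # assumes at least c distinct points
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for k in range(c):
            coef = [w * row[k] ** m for w, row in zip(weights, u)]
            total = sum(coef)
            new_centers.append(tuple(
                sum(cf * p[j] for cf, p in zip(coef, points)) / total
                for j in range(dim)))
        centers = new_centers
    return centers

def stream_fcm(chunks, c, history=1, m=2.0):
    """Process non-empty chunks in order; return (centers, weights) of the last one.

    history=0 clusters every chunk independently; a larger value lets the
    summaries of more past chunks influence the current clustering.
    """
    summaries = []
    for chunk in chunks:
        raw = list(chunk)
        points = list(raw)
        weights = [1.0] * len(raw)
        recent = summaries[-history:] if history > 0 else []
        for old_centers, old_weights in recent:
            points.extend(old_centers)   # past centroids act as weighted points
            weights.extend(old_weights)
        centers = weighted_fcm(points, weights, c, m)
        # Assumption: a summary's weight reflects only this chunk's raw points.
        u = memberships(raw, centers, m)
        chunk_weights = [sum(row[k] for row in u) for k in range(c)]
        summaries.append((centers, chunk_weights))
    return summaries[-1]
```

In this sketch, a small `history` lets the centroids follow an evolving distribution quickly, while a larger value gives past chunks more influence, which is the tradeoff the abstract refers to.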