• DocumentCode
    226875
  • Title
    Stochastic gradient descent based fuzzy clustering for large data
  • Author
    Yangtao Wang ; Lihui Chen ; Jian-Ping Mei
  • Author_Institution
    Sch. of Electr. & Electron. Eng., Nanyang Technol. Univ., Singapore, Singapore
  • fYear
    2014
  • fDate
    6-11 July 2014
  • Firstpage
    2511
  • Lastpage
    2518
  • Abstract
    Data is growing at an unprecedented rate in commercial and scientific areas. Under these circumstances, clustering algorithms for large data that scale well and consume little memory become increasingly important. In this paper, we propose a new clustering approach called stochastic gradient based fuzzy clustering (SGFC), which performs its optimization by stochastic approximation in order to handle such large data. We derive an adaptive learning rate for the gradient descent approach employed in SGFC that can be updated incrementally and maintained automatically. Moreover, SGFC is extended to a mini-batch SGFC to reduce the stochastic noise. Additionally, a multi-pass SGFC is proposed to improve the clustering performance. Experiments have been conducted on synthetic data to show the effectiveness of the derived adaptive learning rate. Experimental studies have also been conducted on several large benchmark datasets, including real-world image and document datasets. Compared with existing fuzzy clustering approaches for large data, the mini-batch SGFC shows comparable or better accuracy with significantly less time consumption. These results demonstrate the great potential of SGFC for large data analysis.
  • Keywords
    data analysis; fuzzy set theory; gradient methods; learning (artificial intelligence); pattern clustering; SGFC approach; adaptive learning rate; clustering algorithm; clustering performance; document dataset; fuzzy clustering; image dataset; large data analysis; minibatch SGFC; multipass SGFC; stochastic approximation; stochastic gradient based fuzzy clustering; stochastic gradient descent; synthetic data; Algorithm design and analysis; Clustering algorithms; Educational institutions; Equations; Mathematical model; Memory management; Noise;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems (FUZZ-IEEE), 2014 IEEE International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4799-2073-0
  • Type
    conf
  • DOI
    10.1109/FUZZ-IEEE.2014.6891755
  • Filename
    6891755