• DocumentCode
    244908
  • Title
    Diverse Power Iteration Embeddings and Its Applications
  • Author
    Hao Huang ; Shinjae Yoo ; Dantong Yu ; Hong Qin
  • fYear
    2014
  • fDate
    14-17 Dec. 2014
  • Firstpage
    200
  • Lastpage
    209
  • Abstract
    Spectral embedding is one of the most effective dimension reduction algorithms in data mining. However, its computational complexity must be reduced before it can be applied to real-world, large-scale data analysis. Much research has focused on developing approximate spectral embeddings, which are more efficient but far less effective. This paper proposes Diverse Power Iteration Embeddings (DPIE), which not only retains efficiency similar to that of power iteration methods but also produces a series of diverse and more effective embedding vectors. We test this method by applying it to various data mining applications (e.g., clustering, anomaly detection, and feature selection) and evaluating the resulting performance improvements. The experimental results show that the proposed DPIE is more effective than popular spectral approximation methods and achieves quality similar to that of classic spectral embedding derived from eigendecomposition. Moreover, it is extremely fast on big data applications. For example, in terms of clustering results, DPIE achieves as much as 95% of the quality of classic spectral clustering on complex datasets while running more than 4000 times faster in a limited-memory environment.
  • Keywords
    Big Data; computational complexity; data analysis; data mining; eigenvalues and eigenfunctions; pattern clustering; vectors; DPIE; approximate spectral embeddings; big data applications; computation complexity; data mining; dimension reduction algorithms; diverse power iteration embeddings; eigendecompositions; embedding vectors; large scale data analysis; spectral clustering; Approximation algorithms; Clustering algorithms; Complexity theory; Data mining; Eigenvalues and eigenfunctions; Equations; Vectors;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining (ICDM), 2014 IEEE International Conference on
  • Conference_Location
    Shenzhen
  • ISSN
    1550-4786
  • Print_ISBN
    978-1-4799-4303-6
  • Type
    conf
  • DOI
    10.1109/ICDM.2014.87
  • Filename
    7023337