• DocumentCode
    3105967
  • Title
    A Novel Scalable Algorithm for Supervised Subspace Learning
  • Author
    Yan, Jun ; Liu, Ning ; Zhang, Benyu ; Yang, Qiang ; Yan, Shuicheng ; Chen, Zheng
  • Author_Institution
    Microsoft Res. Asia, Beijing
  • fYear
    2006
  • fDate
    18-22 Dec. 2006
  • Firstpage
    721
  • Lastpage
    730
  • Abstract
    Subspace learning approaches aim to discover the important statistical distributions of high-dimensional data in lower dimensions. Methods such as principal component analysis (PCA) do not make use of class information, and linear discriminant analysis (LDA) cannot be performed efficiently in a scalable way. In this paper, we propose a novel, highly scalable supervised subspace learning algorithm called the supervised Kampong measure (SKM). In the transformed lower-dimensional subspace, it places data points as close as possible to their corresponding class mean while simultaneously placing them as far as possible from the other class means. Theoretical derivation shows that our algorithm is not limited by the number of classes or by the singularity problem faced by LDA. Furthermore, our algorithm can be executed incrementally, with learning done in an online fashion as data streams are received. Experimental results on several datasets, including the very large text dataset RCV1, show that the proposed algorithm outperforms PCA, LDA, and a popular feature selection approach, information gain (IG), on classification problems.
  • Keywords
    learning (artificial intelligence); principal component analysis; statistical distributions; linear discriminant analysis; principal component analysis; scalable algorithm; singularity problem; statistical distribution; supervised Kampong measure; supervised subspace learning; Asia; Classification algorithms; Clustering algorithms; Computational complexity; Computer science; Linear discriminant analysis; Machine learning; Machine learning algorithms; Principal component analysis; Statistical distributions;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2006. ICDM '06. Sixth International Conference on
  • Conference_Location
    Hong Kong
  • ISSN
    1550-4786
  • Print_ISBN
    0-7695-2701-7
  • Type
    conf
  • DOI
    10.1109/ICDM.2006.7
  • Filename
    4053097