• DocumentCode
    3320989
  • Title
    Document Classification with Unsupervised Nonnegative Matrix Factorization and Supervised Perceptron Learning
  • Author
    Barman, Paresh C. ; Lee, Soo-Young
  • Author_Institution
    Korea Adv. Inst. of Sci. & Technol., Daejeon
  • fYear
    2007
  • fDate
    8-11 July 2007
  • Firstpage
    182
  • Lastpage
    186
  • Abstract
    A new hybrid neural network model is proposed for document classification. The NMF-SLP model consists of two layers: the first, a non-negative matrix factorization (NMF) layer, decomposes a document into several clusters, and the second, a single-layer perceptron (SLP) layer, classifies the document based on those clusters. The NMF layer is trained by factorizing the document word-frequency matrix into a feature matrix and a coefficient matrix, and then estimating the pseudo-inverse of the feature matrix. The SLP layer is trained by a standard error-minimization algorithm. Classification performance is investigated as a function of the number of clusters, i.e., the number of hidden neurons, and of the slope of the sigmoidal nonlinearity at the hidden neurons. The developed model demonstrates much better classification accuracy than the simple NMF and k-NN classifiers, while a standard multi-layer perceptron is almost impractical to train properly because of the high-dimensional inputs and the large number of adaptive elements.
  • Keywords
    document handling; matrix decomposition; matrix inversion; minimisation; pattern classification; perceptrons; unsupervised learning; NMF layer; NMF-SLP model; SLP layer; document classification; document word frequency matrix factorization; error minimization algorithm; feature matrix pseudoinverse estimation; hybrid neural network model; single-layer-perceptron layer; supervised perceptron learning; unsupervised nonnegative matrix factorization; Cities and towns; Clustering algorithms; Feature extraction; Frequency estimation; Matrix decomposition; Neural networks; Neurons; Supervised learning; Transmission line matrix methods; Unsupervised learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Acquisition, 2007. ICIA '07. International Conference on
  • Conference_Location
    Seogwipo-si
  • Print_ISBN
    1-4244-1220-X
  • Electronic_ISBN
    1-4244-1220-X
  • Type
    conf
  • DOI
    10.1109/ICIA.2007.4295722
  • Filename
    4295722
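
The two-layer pipeline outlined in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the record does not specify the NMF update rule, the exact form of the hidden activation, or the SLP training details, so the multiplicative updates, the `slope` parameter placement, the squared-error gradient descent, the toy matrix, and all function names (`nmf`, `hidden`, `train_slp`) are hypothetical choices made here.

```python
import numpy as np

def nmf(V, k, iters=500, seed=0, eps=1e-9):
    """Assumed update rule: multiplicative updates so that
    V (words x docs) ~= W (words x k) @ H (k x docs), with W, H >= 0."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], k)) + eps
    H = rng.random((k, V.shape[1])) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def sigmoid(x, slope=1.0):
    return 1.0 / (1.0 + np.exp(-slope * x))

def hidden(W_pinv, V, slope=1.0):
    """Hidden-layer activations: project documents through the pseudo-inverse
    of the feature matrix, then apply a sigmoid (assumed form)."""
    return sigmoid(W_pinv @ V, slope)

def train_slp(A, Y, lr=0.5, epochs=2000, seed=0):
    """Single sigmoid output layer trained by gradient descent on squared error.
    A: (k x docs) hidden activations, Y: (classes x docs) one-hot targets."""
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(Y.shape[0], A.shape[0]))
    b = np.zeros((Y.shape[0], 1))
    n_docs = A.shape[1]
    for _ in range(epochs):
        out = sigmoid(U @ A + b)
        delta = (out - Y) * out * (1.0 - out)   # error times sigmoid derivative
        U -= lr * (delta @ A.T) / n_docs
        b -= lr * delta.mean(axis=1, keepdims=True)
    return U, b

# Toy word-frequency matrix (4 words x 6 documents), invented for illustration.
V = np.array([
    [5, 4, 6, 0, 0, 1],
    [4, 6, 5, 1, 0, 0],
    [0, 1, 0, 5, 6, 4],
    [1, 0, 0, 6, 4, 5],
], dtype=float)
labels = np.array([0, 0, 0, 1, 1, 1])
Y = np.eye(2)[labels].T                  # one-hot targets, shape (2, 6)

W, H = nmf(V, k=2)                       # unsupervised layer: feature and coefficient matrices
W_pinv = np.linalg.pinv(W)               # pseudo-inverse of the feature matrix
A = hidden(W_pinv, V, slope=1.0)         # k hidden neurons per document
U, b = train_slp(A, Y)                   # supervised layer
pred = np.argmax(sigmoid(U @ A + b), axis=0)
print(pred)
```

In this sketch the number of clusters `k` and the sigmoid `slope` correspond to the two quantities the abstract says were varied; the values used here are arbitrary.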