• DocumentCode
    2488107
  • Title

    Artificial Neural Network for Document Classification Using Latent Semantic Indexing

  • Author

    Li, Cheng Hua ; Park, Soon Cheol

  • Author_Institution
    Chonbuk Nat. Univ., Jeonju
  • fYear
    2007
  • fDate
    23-24 Nov. 2007
  • Firstpage
    17
  • Lastpage
    21
  • Abstract
    In this study, we construct document classification systems using artificial neural networks trained by the multi-output perceptron learning algorithm (MOPL) and the back-propagation neural network (BPNN). Most classic classification systems represent the contents of documents with a set of index terms, an approach known as the vector space model (VSM). However, this method requires a high-dimensional space to represent the documents and does not take into account the semantic relationships between terms, which can lead to poor classification performance. In this paper, we introduce latent semantic indexing (LSI) into our systems. LSI not only reduces the dimensionality to a great extent but also uncovers important associative relationships between terms. It also accelerates training and improves classification accuracy. We test our classification systems on the standard Reuters-21578 collection. The experimental evaluations show that the systems trained with LSI train considerably faster than the original systems trained with the VSM, and that they yield better classification results.
  • Keywords
    backpropagation; document handling; indexing; multilayer perceptrons; BPNN; LSI; MOPL; VSM; artificial neural network; back-propagation neural network; document classification; latent semantic indexing; multioutput perceptron learning algorithm; standard Reuters-21578 collection; vector space model; Artificial neural networks; Indexing; Inference algorithms; Information retrieval; Information technology; Internet; Labeling; Large scale integration; Machine learning algorithms; Neural networks
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Technology Convergence, 2007. ISITC 2007. International Symposium on
  • Conference_Location
    Jeonju
  • Print_ISBN
    0-7695-3045-1
  • Electronic_ISBN
    978-0-7695-3045-1
  • Type

    conf

  • DOI
    10.1109/ISITC.2007.69
  • Filename
    4410598
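
The abstract contrasts the high-dimensional VSM representation with the reduced LSI representation. LSI is conventionally computed as a truncated singular value decomposition of the term-document matrix, and the sketch below illustrates that idea only. It is not the authors' implementation: the six-term vocabulary, the toy counts, the choice of k = 2, and the helper names are all hypothetical, and power iteration with deflation stands in for a library SVD so that the example needs nothing beyond the standard library.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def cosine(u, v):
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))

def top_right_singular_vectors(X, k, iters=200):
    """Approximate the top-k right singular vectors of X (documents x terms).

    Uses power iteration on the Gram matrix X^T X, projecting out the
    vectors already found (deflation). Illustrative only; a real system
    would normally call a library SVD routine.
    """
    n_terms = len(X[0])
    gram = [[sum(row[i] * row[j] for row in X) for j in range(n_terms)]
            for i in range(n_terms)]
    basis = []
    for c in range(k):
        # Deterministic, non-uniform start vector (an arbitrary choice).
        v = normalize([1.0 + 0.1 * (i + c) for i in range(n_terms)])
        for _ in range(iters):
            w = [dot(gram_row, v) for gram_row in gram]
            for u in basis:
                coeff = dot(w, u)
                w = [a - coeff * b for a, b in zip(w, u)]
            v = normalize(w)
        basis.append(v)
    return basis

# Hypothetical toy corpus: rows are documents, columns are term counts.
TERMS = ["bank", "loan", "interest", "wheat", "corn", "harvest"]
X = [
    [2, 1, 0, 0, 0, 0],  # finance documents
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],  # grain documents
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
]

# VSM uses the 6-dimensional rows of X directly; LSI projects each
# document onto the k = 2 leading latent directions.
basis = top_right_singular_vectors(X, k=2)
reduced = [[dot(doc, v) for v in basis] for doc in X]

print("VSM cosine, doc 0 vs doc 1:", cosine(X[0], X[1]))
print("LSI cosine, doc 0 vs doc 1:", cosine(reduced[0], reduced[1]))
print("LSI cosine, doc 0 vs doc 3:", cosine(reduced[0], reduced[3]))
```

In this toy setting, two finance documents that share only one term are only weakly similar in the VSM but nearly identical in the reduced space, while documents from different topics stay nearly orthogonal. The shorter vectors in `reduced` are what a classifier such as a perceptron or back-propagation network would take as input in place of the full term vectors.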