DocumentCode
2488107
Title
Artificial Neural Network for Document Classification Using Latent Semantic Indexing
Author
Li, Cheng Hua ; Park, Soon Cheol
Author_Institution
Chonbuk Nat. Univ., Jeonju
fYear
2007
fDate
23-24 Nov. 2007
Firstpage
17
Lastpage
21
Abstract
In this study, we construct document classification systems using artificial neural networks trained with the multi-output perceptron learning algorithm (MOPL) and the back-propagation neural network (BPNN). Most classic classification systems represent the contents of documents with a set of index terms, an approach known as the vector space model (VSM). However, this method requires a high-dimensional space to represent the documents, and it does not take into account the semantic relationships between terms, which can lead to poor classification performance. In this paper, we introduce latent semantic indexing (LSI) into our systems. LSI not only reduces the dimensionality to a great extent but also uncovers important associative relationships between terms. It also accelerates training and improves classification accuracy. We test our classification systems on the standard Reuters-21578 collection. The experimental evaluations show that the systems trained with LSI train considerably faster than the original systems trained with the VSM and that they yield better classification results.
Keywords
backpropagation; document handling; indexing; multilayer perceptrons; BPNN; LSI; MOPL; VSM; artificial neural network; back-propagation neural network; document classification; latent semantic indexing; multioutput perceptron learning algorithm; standard Reuters-21578 collection; vector space model; Artificial neural networks; Indexing; Inference algorithms; Information retrieval; Information technology; Internet; Labeling; Large scale integration; Machine learning algorithms; Neural networks
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Technology Convergence, 2007. ISITC 2007. International Symposium on
Conference_Location
Jeonju
Print_ISBN
0-7695-3045-1
Electronic_ISBN
978-0-7695-3045-1
Type
conf
DOI
10.1109/ISITC.2007.69
Filename
4410598