Title :
Applying the conjugate gradient method for text document categorization
Author :
Tam, Vincent ; Setiono, Rudy ; Santoso, A.
Author_Institution :
Dept. of Electr. & Electron. Eng., Hong Kong Univ., China
Abstract :
We investigate the effectiveness of two methods for solving the linear least squares fit (LLSF) problem in document categorization. The first is the singular value decomposition (SVD) method, which has previously been used for document categorization. The second is the conjugate gradient (CG) method, one of the most effective algorithms for solving systems of linear equations. To our knowledge, however, the CG method has never been applied to the document classification problem. We therefore compare the effectiveness of these two LLSF methods for categorizing text documents. In addition, we examine how different term weighting schemes affect their classification performance. Lastly, we compare the LLSF classifiers against the neighborhood-based Dt-kNN classifier, our best variant of the kNN classifier integrated with a dynamic threshold scheme, on the Reuters 21578 dataset. Besides being the first proposal to use the CG method for document classification, our work opens up many exciting directions for future investigation.
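As an illustration of the two solution strategies named in the abstract, the following minimal sketch solves a small least squares problem both ways. It is not the authors' implementation. It assumes a formulation in which X is a document-by-term weight matrix, Y is a document-by-category indicator matrix, and the classifier is the matrix W minimizing ||XW - Y||. The function names `llsf_svd` and `llsf_cg`, the toy data, and the choice to run CG on the normal equations XᵀXW = XᵀY are all assumptions made for this example.

```python
import numpy as np


def llsf_svd(X, Y, rcond=1e-10):
    """Least squares solution W = pinv(X) @ Y, computed from the thin SVD of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_inv = np.zeros_like(s)
    keep = s > rcond * s.max()        # drop near-zero singular values
    s_inv[keep] = 1.0 / s[keep]
    return Vt.T @ (s_inv[:, None] * (U.T @ Y))


def llsf_cg(X, Y, tol=1e-10, max_iter=1000):
    """Conjugate gradient on the normal equations X^T X w = X^T y, one category at a time."""
    W = np.zeros((X.shape[1], Y.shape[1]))
    for j in range(Y.shape[1]):
        w = np.zeros(X.shape[1])
        r = X.T @ Y[:, j]             # residual of the normal equations at w = 0
        p = r.copy()                  # initial search direction
        rs = r @ r
        for _ in range(max_iter):
            if np.sqrt(rs) < tol:
                break
            Xp = X @ p
            alpha = rs / (Xp @ Xp)    # step length; p^T (X^T X) p equals ||X p||^2
            w += alpha * p
            r -= alpha * (X.T @ Xp)
            rs_new = r @ r
            p = r + (rs_new / rs) * p
            rs = rs_new
        W[:, j] = w
    return W


# Hypothetical toy data: 4 documents, 3 terms, 2 categories.
X = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0],
              [3.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])
Y = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

W_svd = llsf_svd(X, Y)
W_cg = llsf_cg(X, Y)
scores = X @ W_cg                     # per-category scores for each document
```

On a well-conditioned problem like this toy case the two solvers should agree up to numerical tolerance. The CG variant only needs matrix-vector products with X and Xᵀ, which is one reason it can be attractive for large, sparse term matrices.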
Keywords :
classification; conjugate gradient methods; document handling; least squares approximations; singular value decomposition; text analysis; Reuters 21578 dataset; conjugate gradient method; document classification problem; dynamic threshold scheme; linear least squares fit problem; neighborhood-based Dt-kNN classifier; singular value decomposition method; text document categorization; Gradient methods; Pattern recognition;
Conference_Title :
Pattern Recognition, 2004. ICPR 2004. Proceedings of the 17th International Conference on
Print_ISBN :
0-7695-2128-2
DOI :
10.1109/ICPR.2004.1334305