DocumentCode :
3179603
Title :
Quick induction of NNTrees for text categorization based on discriminative multiple centroid approach
Author :
Hayashi, Hirotomo ; Zhao, Qiangfu
Author_Institution :
Dept. of Comput. & Inf. Syst., Univ. of Aizu, Aizu-Wakamatsu, Japan
fYear :
2010
fDate :
10-13 Oct. 2010
Firstpage :
705
Lastpage :
712
Abstract :
A neural network tree (NNTree) is a hybrid model for machine learning. We have previously proposed an efficient algorithm for inducing NNTrees and verified experimentally that NNTrees are efficient and effective for a range of pattern recognition problems. For problems such as text categorization, however, inducing NNTrees can be computationally very expensive. To address this, we have tried inducing NNTrees after dimensionality reduction. Specifically, we have studied approaches based on linear discriminant analysis (LDA), principal component analysis (PCA), and direct centroids (DC). The results show that DC is simple but not effective, and that LDA performs better but finding its transformation matrix is computationally very costly. To solve the problem more efficiently, this paper proposes the discriminative multiple centroid (DMC) approach. DMC is a two-stage approach: all data are first mapped to a lower-dimensional space based on the centroids, and LDA is then conducted in the mapped space. Experimental results on three public text datasets show that in all cases DMC is much faster than LDA without significant degradation in performance.
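The two-stage idea in the abstract (map to a centroid-based space, then run LDA there) can be illustrated with a minimal sketch. The abstract does not say how the centroids are chosen or how the mapping is defined, so everything below is an assumption made for illustration: one centroid per class, a mapping by inner products with the centroids, a two-class Fisher discriminant with a small ridge term, and hypothetical toy data. It is not the authors' implementation, and it omits NNTree induction entirely.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mean(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def class_centroids(X, y):
    """One centroid per class (assumption; the paper may use several per class)."""
    return [mean([x for x, t in zip(X, y) if t == c]) for c in sorted(set(y))]

def map_to_centroids(X, centroids):
    """Stage 1 (assumed form): describe each sample by its inner products
    with the centroids, giving one feature per centroid."""
    return [[dot(x, c) for c in centroids] for x in X]

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][c] * z[c] for c in range(i + 1, n))) / M[i][i]
    return z

def fisher_direction(Z, y, ridge=1e-6):
    """Stage 2: two-class Fisher LDA direction w = Sw^{-1} (m1 - m0),
    computed in the low-dimensional mapped space."""
    labels = sorted(set(y))
    groups = [[z for z, t in zip(Z, y) if t == c] for c in labels]
    m0, m1 = mean(groups[0]), mean(groups[1])
    k = len(m0)
    # Within-class scatter, with a small ridge on the diagonal for stability.
    Sw = [[ridge if i == j else 0.0 for j in range(k)] for i in range(k)]
    for g, m in zip(groups, (m0, m1)):
        for z in g:
            d = [a - b for a, b in zip(z, m)]
            for i in range(k):
                for j in range(k):
                    Sw[i][j] += d[i] * d[j]
    return solve(Sw, [a - b for a, b in zip(m1, m0)])

# Hypothetical toy term-frequency vectors: 6 terms, 2 classes.
X = [[3, 2, 0, 0, 1, 0], [2, 3, 1, 0, 0, 0], [4, 1, 0, 1, 0, 0],
     [0, 0, 3, 2, 0, 1], [0, 1, 2, 3, 0, 0], [1, 0, 4, 2, 0, 0]]
y = [0, 0, 0, 1, 1, 1]

centroids = class_centroids(X, y)
Z = map_to_centroids(X, centroids)   # 6 dimensions -> 2 dimensions
w = fisher_direction(Z, y)           # LDA on a 2x2 scatter matrix
scores = [dot(z, w) for z in Z]      # 1-D discriminant scores
```

In this sketch the scatter matrix that LDA must invert has one row per centroid rather than one row per term, which is the kind of saving the abstract attributes to running LDA in the mapped space.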
Keywords :
decision trees; learning (artificial intelligence); matrix algebra; neural nets; pattern classification; principal component analysis; text analysis; NNTree; computational cost; direct centroid based approach; discriminative multiple centroid approach; hybrid learning model; linear discriminant analysis; machine learning; neural network tree; pattern recognition; principal component analysis; public text dataset; quick induction; text categorization; transformation matrix; Artificial neural networks; Pattern recognition; decision tree; dimensionality reduction; neural network; text categorization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Systems Man and Cybernetics (SMC), 2010 IEEE International Conference on
Conference_Location :
Istanbul
ISSN :
1062-922X
Print_ISBN :
978-1-4244-6586-6
Type :
conf
DOI :
10.1109/ICSMC.2010.5641834
Filename :
5641834