Title :
Quick induction of NNTrees for text categorization based on discriminative multiple centroid approach
Author :
Hayashi, Hirotomo ; Zhao, Qiangfu
Author_Institution :
Dept. of Comput. & Inf. Syst., Univ. of Aizu, Aizu-Wakamatsu, Japan
Abstract :
The neural network tree (NNTree) is a hybrid model for machine learning. We have previously proposed an efficient algorithm for inducing NNTrees and verified experimentally that NNTrees are efficient and effective for a range of pattern recognition problems. For problems such as text categorization, however, inducing NNTrees can be computationally very expensive. To address this, we have tried inducing NNTrees after dimensionality reduction. Specifically, we have studied approaches based on linear discriminant analysis (LDA), principal component analysis (PCA), and the direct centroid (DC). The results show that DC is simple but not effective, whereas LDA performs better but incurs a very high computational cost for finding the transformation matrix. To solve the problem more efficiently, this paper proposes the discriminative multiple centroid (DMC) approach. DMC is a two-stage approach: all data are first mapped to a lower-dimensional space based on the centroids, and LDA is then conducted in the mapped space. Experimental results on three public text datasets show that in all cases DMC is much faster than LDA without significant degradation in performance.
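The two-stage idea described in the abstract (map the data to a centroid-based space, then run LDA there) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how many centroids are used or how the mapping is defined, so the sketch assumes one centroid per class, a mapping by dot product with each centroid, only two classes, and Fisher's two-class discriminant in place of general LDA. All function names and the toy data are hypothetical, and NNTree induction itself is not shown.

```python
# Hedged sketch: centroid mapping followed by LDA in the mapped space.
# Assumptions NOT taken from the abstract: one centroid per class, mapping by
# dot product with each centroid, two classes, Fisher's two-class discriminant.

def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def map_to_centroids(x, centroids):
    """Stage 1: represent x by its dot product with each class centroid."""
    return [dot(x, c) for c in centroids]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fisher_direction(z0, z1, ridge=1e-6):
    """Stage 2: w = Sw^{-1} (m1 - m0); a small ridge keeps Sw invertible."""
    m0, m1 = centroid(z0), centroid(z1)
    k = len(m0)
    Sw = [[ridge if i == j else 0.0 for j in range(k)] for i in range(k)]
    for rows, m in ((z0, m0), (z1, m1)):
        for z in rows:
            d = [zi - mi for zi, mi in zip(z, m)]
            for i in range(k):
                for j in range(k):
                    Sw[i][j] += d[i] * d[j]
    return solve(Sw, [a - b for a, b in zip(m1, m0)])

def two_stage_scores(docs0, docs1):
    """Map both classes to centroid space, then project onto the Fisher direction."""
    centroids = [centroid(docs0), centroid(docs1)]
    z0 = [map_to_centroids(x, centroids) for x in docs0]
    z1 = [map_to_centroids(x, centroids) for x in docs1]
    w = fisher_direction(z0, z1)
    return [dot(w, z) for z in z0], [dot(w, z) for z in z1]

# Toy term-count vectors, made up for illustration: 6 terms, 3 documents per class.
docs0 = [[3, 2, 0, 0, 1, 0], [2, 3, 1, 0, 0, 0], [3, 3, 0, 1, 0, 0]]
docs1 = [[0, 0, 3, 2, 0, 1], [0, 1, 2, 3, 0, 0], [1, 0, 3, 3, 0, 0]]
scores0, scores1 = two_stage_scores(docs0, docs1)
```

In this sketch the scatter matrix is only k by k, where k is the number of centroids, rather than as large as the vocabulary. That is one plausible reading of why a centroid mapping ahead of LDA would reduce the cost of finding the transformation matrix. On the toy data, the projected scores of the first class all fall below those of the second.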
Keywords :
decision trees; learning (artificial intelligence); matrix algebra; neural nets; pattern classification; principal component analysis; text analysis; NNTree; computational cost; direct centroid based approach; discriminative multiple centroid approach; hybrid learning model; linear discriminant analysis; machine learning; neural network tree; pattern recognition; public text dataset; quick induction; text categorization; transformation matrix; Artificial neural networks; decision tree; dimensionality reduction; neural network
Conference_Title :
Systems Man and Cybernetics (SMC), 2010 IEEE International Conference on
Conference_Location :
Istanbul
Print_ISBN :
978-1-4244-6586-6
DOI :
10.1109/ICSMC.2010.5641834