Title : 
TCBPLK: A New Method of Text Categorization
         
        
        
            Author_Institution : 
Beijing Univ., Beijing
         
        
        
        
        
        
        
            Abstract : 
This paper presents a new text categorization method, called TCBPLK, based on P-L theory and the Kohonen network. When a Kohonen network is applied to text categorization, it suffers from slow training. The defect is more pronounced for high-dimensional text vectors, to the point that a categorization result may not be obtained at all. The new method builds a vector space model of term weights using P-L theory, which strengthens the contribution of words according to their effect on categorization and reduces the vector dimension by eliminating redundant features. Experimental results confirm that the TCBPLK method reduces the number of vectors and improves the generalization and precision of text categorization.
         
        
            Keywords : 
text analysis; Kohonen network; P-L theory; text categorization; vector space model; Cybernetics; Functional analysis; Information analysis; Information retrieval; Learning systems; Machine learning; Matrix decomposition; Pattern analysis; Vocabulary
         
        
        
        
            Conference_Title : 
Machine Learning and Cybernetics, 2007 International Conference on
         
        
            Conference_Location : 
Hong Kong
         
        
            Print_ISBN : 
978-1-4244-0973-0
         
        
            Electronic_ISBN : 
978-1-4244-0973-0
         
        
        
            DOI : 
10.1109/ICMLC.2007.4370825