Title :
Feature selection in text categorization using the Baldwin effect
Author :
Yu, Edmund S. ; Liddy, Elizabeth D.
Author_Institution :
CIS, Syracuse Univ., NY, USA
Abstract :
Text categorization is the problem of automatically assigning predefined categories to natural language texts. A major difficulty of this problem stems from the high dimensionality of its feature space. Reducing the dimensionality, or selecting a good subset of features, without sacrificing accuracy, is of great importance for neural networks to be successfully applied to the area. In this paper, we propose a neuro-genetic approach to feature selection in text categorization. Candidate feature subsets are evaluated by using three-layer feedforward neural networks. The Baldwin effect concerns the tradeoffs between learning and evolution. It is used in our research to guide and improve the GA-based evolution of the feature subsets. Experimental results show that our neuro-genetic algorithm is able to perform as well as, if not better than, the best results of neural networks to date, while using fewer input features.
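The abstract does not give the encoding, genetic operators, network settings, or fitness function, so the following is only a minimal sketch of the general idea under stated assumptions: each individual is a bit mask over features, a small input-hidden-output network is trained briefly on the masked inputs, and the resulting accuracy (minus a hypothetical size penalty) serves as fitness. In the Baldwinian spirit, learning shapes fitness but the trained weights are never written back into the genome. All names (`train_and_score`, `evolve`, `toy_corpus`), the penalty term, and the toy data are illustrative assumptions, not the authors' method.

```python
import math
import random


def train_and_score(mask, X, y, rng, hidden=3, epochs=15, lr=0.5):
    """Train a small input-hidden-output net on the masked features.

    Returns training accuracy in [0, 1]. Weights are re-initialised on
    every call, so nothing learned is inherited (Baldwinian, not Lamarckian).
    """
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty feature subset cannot classify anything
    w1 = [[rng.uniform(-0.5, 0.5) for _ in idx] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def sig(z):
        return 1.0 / (1.0 + math.exp(-z))

    def forward(row):
        xs = [row[i] for i in idx]
        h = [sig(sum(w * x for w, x in zip(w1[j], xs)) + b1[j])
             for j in range(hidden)]
        out = sig(sum(w * hj for w, hj in zip(w2, h)) + b2)
        return xs, h, out

    for _ in range(epochs):
        for row, target in zip(X, y):
            xs, h, out = forward(row)
            d_out = out - target  # cross-entropy gradient at the output unit
            for j in range(hidden):
                d_h = d_out * w2[j] * h[j] * (1.0 - h[j])
                w2[j] -= lr * d_out * h[j]
                b1[j] -= lr * d_h
                for k, x in enumerate(xs):
                    w1[j][k] -= lr * d_h * x
            b2 -= lr * d_out
    correct = sum((forward(row)[2] >= 0.5) == bool(t) for row, t in zip(X, y))
    return correct / len(X)


def evolve(X, y, n_features, pop_size=10, generations=6, penalty=0.05, seed=0):
    """Evolve feature masks with truncation selection, crossover, mutation."""
    rng = random.Random(seed)

    def fitness(mask):
        # Assumed fitness: learned accuracy minus a penalty on subset size.
        return train_and_score(mask, X, y, rng) - penalty * sum(mask) / n_features

    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)      # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < 0.1) for bit in child]  # bit-flip mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


def toy_corpus(n_docs=30, n_features=6, seed=1):
    """Hypothetical data: binary term vectors whose label depends on term 0 only."""
    rng = random.Random(seed)
    X = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_docs)]
    y = [row[0] for row in X]
    return X, y


X, y = toy_corpus()
best = evolve(X, y, n_features=6)
print("selected feature mask:", best)
```

Because fitness depends on a freshly trained network each time, it is noisy; a fuller treatment would score on held-out documents rather than training accuracy.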
Keywords :
feature extraction; feedforward neural nets; genetic algorithms; multilayer perceptrons; text analysis; Baldwin effect; GA-based evolution; dimensionality reduction; feature selection; feature subset selection; learning; natural language texts; neuro-genetic approach; text categorization; three-layer feedforward neural networks; Computational Intelligence Society; Data mining; Filters; Genetic algorithms; Humans; Indexing; Information retrieval; Natural languages; Neural networks; Text categorization
Conference_Title :
Neural Networks, 1999. IJCNN '99. International Joint Conference on
Conference_Location :
Washington, DC
Print_ISBN :
0-7803-5529-6
DOI :
10.1109/IJCNN.1999.833550