DocumentCode :
436137
Title :
Genetic extraction of text category descriptions
Author :
Serrano, J.I. ; del Castillo, M.D.
Author_Institution :
Instituto de Automatica Industrial, CSIC. Ctra. Campo Real, km. 0.200. La Poveda, Arganda del Rey, 28500 Madrid, SPAIN
Volume :
16
fYear :
2004
fDate :
June 28 2004-July 1 2004
Firstpage :
7
Lastpage :
12
Abstract :
This paper deals with a supervised learning method devoted to producing categorization models of text documents. The goal of the method is to use a suitable numerical measurement of example similarity to find centroids describing different categories of examples. The centroids are neither abstract nor statistical models, but rather consist of bits of examples. The centroid-learning method is based on a genetic algorithm, the GAT. The categorization system infers a model by applying the GAT to the set of preclassified documents. The models thus obtained are the category centroids that are used to predict the category of a new document.
Keywords :
Genetic algorithms; Humans; Internet; Machine learning; Natural languages; Organizing; Predictive models; Supervised learning; Terminology; Text categorization; centroid; evolutionary learning; similarity function; text classification;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Automation Congress, 2004. Proceedings. World
Conference_Location :
Seville
Print_ISBN :
1-889335-21-5
Type :
conf
Filename :
1438624