Title :
A Chinese text classification model based on vector space and semantic meaning
Author :
Wang, Bao-Yi ; Zhang, Shao-Min
Author_Institution :
Sch. of Comput., North China Electr. Power Univ., Baoding, China
Abstract :
With electronic text materials increasing rapidly, this work proposes a model for the automatic classification of electronic text information so that such information can be managed and used effectively. The model includes word segmentation based on a word dictionary and statistics, text preprocessing, the design of a weight function for feature words and the selection of those words, text vector space representation, latent semantic indexing, and a text clustering algorithm. Experiments show that the model achieves satisfactory classification results as well as high computational and storage efficiency.
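As a rough illustration of the first stages named in the abstract (dictionary-based word segmentation, feature-word weighting, and the vector space representation), the sketch below may help. It is not the authors' implementation: the toy dictionary, the forward maximum matching strategy, the smoothed TF-IDF weight, and all function names are assumptions made for this example, and the statistical part of the segmentation, the latent semantic indexing step, and the clustering step are left out.

```python
import math
from collections import Counter

# Hypothetical toy dictionary; the paper's actual word dictionary is not given.
DICTIONARY = {"电力", "系统", "电力系统", "文本", "分类", "模型", "管理"}
MAX_WORD_LEN = max(len(word) for word in DICTIONARY)


def segment(text, dictionary=DICTIONARY, max_len=MAX_WORD_LEN):
    """Forward maximum matching: take the longest dictionary word at each position.

    Characters that start no dictionary word are emitted as single-character tokens.
    """
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


def tfidf_vectors(docs):
    """Build one sparse {term: weight} vector per tokenized document.

    The weight is term frequency times a smoothed inverse document frequency;
    this is an assumed weight function, not the one designed in the paper.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


texts = ["电力系统管理", "电力系统模型", "文本分类模型"]
tokens = [segment(t) for t in texts]
vectors = tfidf_vectors(tokens)
```

With these toy inputs, the first two texts share the token 电力系统 and so have a positive cosine similarity, while the first and third share no tokens. A clustering step could then group documents by such similarities, either directly or after projecting the vectors into a lower-dimensional latent semantic space.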
Keywords :
character recognition; natural languages; pattern classification; pattern clustering; semantic networks; text analysis; Chinese text classification model; electronic text information; semantic indexing; semantic meaning; text clustering algorithm; text preprocessing; text vector space; vector space; word segmentation; Algorithm design and analysis; Clustering algorithms; Dictionaries; Electronic mail; Energy management; Indexing; Natural language processing; Natural languages; Statistics; Text categorization
Conference_Title :
Machine Learning and Cybernetics, 2004. Proceedings of 2004 International Conference on
Print_ISBN :
0-7803-8403-2
DOI :
10.1109/ICMLC.2004.1382361