Title :
Extracting characteristic words of text using neural networks
Author :
Saito, Kazumi ; Nakano, Ryohei
Author_Institution :
NTT Commun. Sci. Lab., Kyoto, Japan
Abstract :
In this paper, we consider models for estimating the categories of documents and extracting the characteristic words of those categories. To this end, we focus on three models: naive Bayes and two types of neural networks formalized as statistical models. Suitable categories of documents are estimated from posterior probabilities, and characteristic words are extracted according to the magnitudes of the resulting parameter values. In experiments using a set of real Web pages, we compare these models in terms of categorization performance and their ability to extract characteristic words.
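The abstract does not give the model equations, so the following is only a minimal sketch of the general idea for the naive Bayes case: pick a category from posterior probabilities, then rank words by parameter magnitude. It assumes a multinomial naive Bayes with Laplace smoothing, and it scores a word by how far its log-probability in one category exceeds its average over the other categories. The function names, the smoothing constant `alpha`, the scoring rule, and the toy documents are all illustrative assumptions, not the authors' method or data.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, alpha=1.0):
    """Fit a multinomial naive Bayes model (assumed form, Laplace smoothing).

    docs: list of (category, list_of_words) pairs.
    Returns (log_prior, log_like), where log_like[c][w] = log P(w | c).
    """
    vocab = sorted({w for _, words in docs for w in words})
    doc_counts = Counter(cat for cat, _ in docs)
    word_counts = defaultdict(Counter)
    for cat, words in docs:
        word_counts[cat].update(words)

    log_prior = {c: math.log(n / len(docs)) for c, n in doc_counts.items()}
    log_like = {}
    for c in doc_counts:
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_like[c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return log_prior, log_like


def posterior(words, log_prior, log_like):
    """Return P(category | words) for every category; unknown words are skipped."""
    scores = {
        c: log_prior[c] + sum(log_like[c][w] for w in words if w in log_like[c])
        for c in log_prior
    }
    top = max(scores.values())  # subtract the max for numerical stability
    unnorm = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


def characteristic_words(category, log_like, k=3):
    """Rank words by log P(w | category) minus its mean over the other categories.

    This scoring rule is an assumption; it needs at least two categories.
    """
    others = [c for c in log_like if c != category]

    def score(w):
        rest = sum(log_like[c][w] for c in others) / len(others)
        return log_like[category][w] - rest

    return sorted(log_like[category], key=score, reverse=True)[:k]


# Hypothetical toy corpus standing in for categorized Web pages.
toy_docs = [
    ("sports", ["match", "team", "goal", "team"]),
    ("sports", ["team", "coach", "goal"]),
    ("tech", ["network", "neural", "model"]),
    ("tech", ["model", "network", "data", "network"]),
]

log_prior, log_like = train_naive_bayes(toy_docs)
probs = posterior(["team", "goal"], log_prior, log_like)
best_category = max(probs, key=probs.get)
top_sports_words = characteristic_words("sports", log_like, k=2)
```

On this toy corpus, `best_category` is the category with the highest posterior for the query words, and `top_sports_words` lists the words whose parameters most favor the "sports" category. The neural network models mentioned in the abstract would replace the parameter estimation step, which this sketch does not attempt.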
Keywords :
Bayes methods; neural nets; probability; statistical analysis; word processing; characteristic words extraction; naive Bayes; neural networks; posterior probabilities; real Web pages; statistical models; Electronic mail; Frequency; Laboratories; Machine learning algorithms; Neural networks; Probability; Support vector machine classification; Support vector machines; Text mining; Web pages;
Conference_Title :
Neural Networks, 2004. Proceedings. 2004 IEEE International Joint Conference on
Print_ISBN :
0-7803-8359-1
DOI :
10.1109/IJCNN.2004.1380154