DocumentCode :
3253709
Title :
Daily clustering for the electronic newspaper based on the analysis of trends
Author :
Nakashima, Takuo ; Nakamura, Ryozo
Author_Institution :
Kumamoto Univ., Japan
fYear :
1999
fDate :
1999
Firstpage :
51
Lastpage :
54
Abstract :
To classify newspaper articles automatically, the tf*idf method has been used to weight the words in an article. This method is suitable for fixed databases, but it cannot pick up the topic words of articles because IDF gives a low value to frequently occurring words. We propose a daily clustering method for electronic daily newspapers. Our method is based on the characteristics of articles and the change of content over time. First, we define a weight function for words based on their position in the article and the rate of change of content as time passes. Then we calculate the relation between articles, the clustering value, and the relation between clusters of different days. In experiments, recall and precision improved by several percent compared with earlier methods.
Keywords :
classification; electronic publishing; information resources; pattern clustering; article characteristics; content change; daily clustering method; electronic newspaper; precision rate; recall; tf*idf method; trend analysis; Clustering methods; Computer networks; Databases; Frequency; Intelligent networks; Intelligent systems
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Communications, Computers and Signal Processing, 1999 IEEE Pacific Rim Conference on
Conference_Location :
Victoria, BC
Print_ISBN :
0-7803-5582-2
Type :
conf
DOI :
10.1109/PACRIM.1999.799475
Filename :
799475