DocumentCode
3253709
Title
Daily clustering for the electronic newspaper based on the analysis of trends
Author
Nakashima, Takuo ; Nakamura, Ryozo
Author_Institution
Kumamoto Univ., Japan
fYear
1999
fDate
1999
Firstpage
51
Lastpage
54
Abstract
To classify newspaper articles automatically, the tf*idf method has been used to weight the words in an article. This method is suitable for fixed databases, but it cannot pick up the topic words of articles because the IDF term gives a low value to frequently occurring words. We propose a daily clustering method for electronic daily newspapers. Our method is based on the characteristics of articles and the change in their content over time. First, we define a weight function for words based on their position in the article and the rate of change of content as time passes. Then we calculate the relation between articles, the clustering value, and the relation between clusters of different days. In experiments, recall and precision improved by several percent compared with earlier methods.
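The abstract's point about IDF can be illustrated with a minimal sketch. It assumes the common textbook variant tf * log(N / df) with whitespace tokenization; the function name `tf_idf` and the three sample articles are hypothetical and are not taken from the paper, and the paper's own position- and time-based weight function is not reproduced here.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {word: weight} dict per document, using tf * log(N / df).

    Assumed variant: tf is the relative frequency of the word in the document,
    df is the number of documents that contain the word.
    """
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: count each word once per document it appears in.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append(
            {w: (c / len(tokens)) * math.log(n_docs / df[w]) for w, c in tf.items()}
        )
    return weights


# Hypothetical one-day sample in which every article is about the same topic.
articles = [
    "election results election turnout",
    "election debate schedule",
    "election candidate interview",
]
w = tf_idf(articles)
print(w[0]["election"], w[0]["turnout"])
```

In this sample the word "election" appears in every article, so its IDF is log(3/3) = 0 and its weight vanishes even though it is the shared topic word, while the rarer "turnout" keeps a positive weight. This is the behaviour the abstract describes as the reason tf*idf cannot pick up topic words.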
Keywords
classification; electronic publishing; information resources; pattern clustering; article characteristics; content change; daily clustering method; electronic newspaper; precision rate; recall; tf*idf method; trend analysis; Clustering methods; Computer networks; Databases; Frequency; Intelligent networks; Intelligent systems
fLanguage
English
Publisher
IEEE
Conference_Titel
Communications, Computers and Signal Processing, 1999 IEEE Pacific Rim Conference on
Conference_Location
Victoria, BC
Print_ISBN
0-7803-5582-2
Type
conf
DOI
10.1109/PACRIM.1999.799475
Filename
799475