• DocumentCode
    3253709
  • Title

    Daily clustering for the electronic newspaper based on the analysis of trends

  • Author

    Nakashima, Takuo ; Nakamura, Ryozo

  • Author_Institution
    Kumamoto Univ., Japan
  • fYear
    1999
  • fDate
    1999
  • Firstpage
    51
  • Lastpage
    54
  • Abstract
    To classify newspaper articles automatically, the tf*idf method has been used to weight the words in an article. This method is suitable for fixed databases but cannot pick up the topic words of articles, because the IDF component assigns a low value to frequently occurring words. We propose a daily clustering method for electronic daily newspapers. Our method is based on the characteristics of articles and the change of content over time. First, we define a weight function for words based on their position in the article and the rate at which content changes as time passes. Then we calculate the relation between articles, the clustering value, and the relation between clusters of different days. In experiments, recall and precision improved by several percent compared with previous methods.
  • Keywords
    classification; electronic publishing; information resources; pattern clustering; article characteristics; content change; daily clustering method; electronic newspaper; precision rate; recall; tf*idf method; trend analysis; Clustering methods; Computer networks; Databases; Frequency; Intelligent networks; Intelligent systems
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communications, Computers and Signal Processing, 1999 IEEE Pacific Rim Conference on
  • Conference_Location
    Victoria, BC
  • Print_ISBN
    0-7803-5582-2
  • Type

    conf

  • DOI
    10.1109/PACRIM.1999.799475
  • Filename
    799475
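
The abstract's remark that IDF gives a low value to frequently occurring words can be illustrated with a minimal sketch of standard tf*idf weighting. This is not the authors' weight function, which the abstract does not specify; the function name `tf_idf`, the `log(N / df)` form of IDF, and the sample articles are all assumptions made for illustration.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {word: weight} dict per document, using tf * log(N / df).

    Hypothetical helper for illustration; whitespace tokenization only.
    """
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: number of documents that contain each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({
            word: (count / len(tokens)) * math.log(n_docs / df[word])
            for word, count in tf.items()
        })
    return weights

# Made-up sample articles from a single day, all on the same topic.
articles = [
    "election results announced election night",
    "election campaign funding debate",
    "election turnout rises in cities",
]
weights = tf_idf(articles)
print(weights[0]["election"])  # topic word present in every article
print(weights[0]["results"])   # word present in only one article
```

Because "election" occurs in every sample article, its IDF is log(3/3) = 0 and its weight vanishes even though it is the day's topic word, which is the limitation the abstract describes.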