• DocumentCode
    2727480
  • Title

    Blogosonomy: Autotagging Any Text Using Bloggers' Knowledge

  • Author

    Fujimura, Shigeru ; Fujimura, Ko ; Okuda, Hidenori

  • Author_Institution
    NTT Corp., Yokohama
  • fYear
    2007
  • fDate
    2-5 Nov. 2007
  • Firstpage
    205
  • Lastpage
    212
  • Abstract
    There are at least three barriers to utilizing blog tags in classification or navigation: 40% of entries are not tagged (according to our observations), there are many orthographic or synonymous tag variations, and not all tags are informative. We propose a method of multi-autotagging based on k-NN, a case-based classification method. Our method also merges tags with the same meaning and identifies informative tags. To realize these functions, we propose a term weighting method named residual document frequency (RDF), which can score the similarity between tags. Experiments show the effectiveness of our methods. Our autotagging system is generic and can assign one or more tags to any text, not only blog entries, although the training data is collected from the blogosphere.
  • Keywords
    Internet; classification; text analysis; blogger knowledge; blogosonomy; case-based classification; k-NN classification; multiautotagging; residual document frequency; Frequency; Information services; Merging; Navigation; Training data; Web sites
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence, IEEE/WIC/ACM International Conference on
  • Conference_Location
    Fremont, CA
  • Print_ISBN
    978-0-7695-3026-0
  • Type

    conf

  • DOI
    10.1109/WI.2007.85
  • Filename
    4427089