• DocumentCode
    3228600
  • Title

    Automatic Identification of Chinese Weblogger's Interests Based on Text Classification

  • Author

    Ni, Xiaochuan ; Wu, Xiaoyuan ; Yu, Yong

  • Author_Institution
    Shanghai Jiao Tong Univ.
  • fYear
    2006
  • fDate
    18-22 Dec. 2006
  • Firstpage
    247
  • Lastpage
    253
  • Abstract
    Chinese Weblogs have expanded at an incredible speed in recent years, and they contain plentiful personal information. In this paper, we propose a text-classification-based approach to automatically identify the interests of a Weblogger. To solve the problems arising from classifying Weblog documents, a combination of heterogeneous classifiers is used. We also use a hierarchical classification technique to identify more specific interests. Experiments show that our interest identification approach has high accuracy and that, for most Webloggers in our experiments, the interests implied in the contents of their blogs can be well identified using this approach.
  • Keywords
    Web sites; text analysis; Chinese Weblog document; hierarchical text classification technique; Blogs; Filtering; Support vector machine classification; Support vector machines; Text categorization; Voting;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence, 2006. WI 2006. IEEE/WIC/ACM International Conference on
  • Conference_Location
    Hong Kong
  • Print_ISBN
    0-7695-2747-7
  • Type

    conf

  • DOI
    10.1109/WI.2006.47
  • Filename
    4061373