• DocumentCode
    545347
  • Title
    Machine learning based blog classification personal vs. official facet
  • Author
    Sun, Xueji ; Li, Si ; Xu, Weiran ; Chen, Guang ; Guo, Jun
  • Author_Institution
    Sch. of Inf. & Commun. Eng., Beijing Univ. of Posts & Telecommun., Beijing, China
  • Volume
    1
  • fYear
    2011
  • fDate
    11-13 March 2011
  • Firstpage
    31
  • Lastpage
    34
  • Abstract
    Because blog services provide a wealth of information resources, blog search and classification have considerable research value. This paper focuses on blog classification along the personal vs. official facet. Our system adopts a two-stage strategy: in the training model, lexicons are built automatically; in the classification model, scoring and ranking are carried out in sequence. Our experimental results show that feature selection and Mutual Information weighting benefit the lexicons, yielding significant improvements. Sentiment words, however, improve the results only slightly.
  • Keywords
    Web services; Web sites; feature extraction; learning (artificial intelligence); pattern classification; blog classification; blog search; blog service; classification model; feature selection; information resource; machine learning; mutual information weighting; sentiment word; training model; Blogs; Buildings; Machine learning; Measurement; Mutual information; Testing; Training; Blog Classification; Feature selection; Lexicons; Machine Learning; Sentiment;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Computer Research and Development (ICCRD), 2011 3rd International Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-61284-839-6
  • Type
    conf
  • DOI
    10.1109/ICCRD.2011.5763967
  • Filename
    5763967