• DocumentCode
    424240
  • Title
    Comparison of machine learning algorithms in Chinese Web filtering
  • Author
    Du, A-Ning ; Fang, Bin-Xing
  • Author_Institution
    Res. Center of Comput. Network & Inf. Security Technol., Harbin Inst. of Technol., China
  • Volume
    4
  • fYear
    2004
  • fDate
    26-29 Aug. 2004
  • Firstpage
    2526
  • Abstract
    Web filtering based on the user's demand has attracted booming interest due to the development of the Internet. In the research community, the dominant approach to this problem is based on machine learning algorithms. Web filtering is an inductive process which automatically builds a filter by learning from a set of pre-assigned documents and a description of the user's interest, and then uses it to classify unfiltered Web pages. This survey compares four main machine learning algorithms (decision tree, rule induction, Bayesian algorithm and support vector machines) on a set of Chinese Web pages in terms of filtering effectiveness and computer resources consumed, focusing on the influence of feature set size and training set size. It concludes that support vector machines score highly in Chinese Web filtering applications.
  • Keywords
    Bayes methods; Internet; decision trees; information filters; learning (artificial intelligence); support vector machines; Bayesian algorithm; Chinese Web filtering; Internet; computer resource; decision tree; machine learning algorithm; rule induction; support vector machine; unfiltered Web page; Application software; Bayesian methods; Decision trees; Filtering algorithms; Information filtering; Information filters; Internet; Machine learning algorithms; Support vector machines; Web pages;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2004. Proceedings of 2004 International Conference on
  • Print_ISBN
    0-7803-8403-2
  • Type
    conf
  • DOI
    10.1109/ICMLC.2004.1382229
  • Filename
    1382229