• DocumentCode
    3071175
  • Title
    Content-Based Text Classifiers for Pornographic Web Filtering
  • Author
    Polpinij, Jantima ; Chotthanom, Anirut ; Sibunruang, Chumsak ; Chamchong, Rapeeporn ; Puangpronpitag, Somnuk
  • Author_Institution
    Mahasarakham Univ., Mahasarakham
  • Volume
    2
  • fYear
    2006
  • fDate
    8-11 Oct. 2006
  • Firstpage
    1481
  • Lastpage
    1485
  • Abstract
    Due to the flood of pornographic web sites on the Internet, effective web filtering systems are essential. Content-based web filtering has become one of the important techniques for handling and filtering inappropriate information on the web. We examine two machine learning algorithms (support vector machines and Naive Bayes) for pornographic web filtering based on text content, focusing initially on Thai-language and English-language web sites. In this paper, we aim to investigate whether machine learning algorithms are suitable for web site classification. The empirical results show that the classifier based on support vector machines is more effective for pornographic web filtering than the Naive Bayes classifier, especially with respect to the over-blocking problem.
  • Keywords
    Bayes methods; Internet; content-based retrieval; information filtering; learning (artificial intelligence); natural languages; support vector machines; text analysis; Internet; Naive Bayes classifier; content-based text classifier; machine learning algorithm; pornographic Web filtering system; pornographic Web site; support vector machine; Cybernetics; Information filtering; Information filters; Internet; Machine learning algorithms; Support vector machine classification; Support vector machines; Text categorization; Uniform resource locators; Web pages; Naïve Bayes; Pornographic web filtering; Support Vector Machines; Text Classification;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man and Cybernetics, 2006. SMC '06. IEEE International Conference on
  • Conference_Location
    Taipei
  • Print_ISBN
    1-4244-0099-6
  • Electronic_ISBN
    1-4244-0100-3
  • Type
    conf
  • DOI
    10.1109/ICSMC.2006.384926
  • Filename
    4274060