• DocumentCode
    3109694
  • Title

    A web pornography patrol system by content-based analysis: In particular text and image

  • Author

    Polpinij, Jantima ; Sibunruang, Chumsak ; Paungpronpitag, Somnuk ; Chamchong, Rapeeporn ; Chotthanom, Anirut

  • Author_Institution
    Fac. of Inf., Mahasarakham Univ., Maha Sarakham
  • fYear
    2008
  • fDate
    12-15 Oct. 2008
  • Firstpage
    500
  • Lastpage
    505
  • Abstract
    Children's exposure to pornographic Web sites on the Internet raises safety concerns. To protect children from this inappropriate material, an effective Web filtering system is essential. Content-based Web filtering is one of the important techniques for handling and filtering inappropriate information on the Web. In this paper, we examine a content-based analysis technique for filtering pornographic Web sites. Our system consists of two primary content-based filtering techniques: text and image. For text analysis, the support vector machine (SVM) algorithm and an N-gram model based on Bayes' theorem are applied and evaluated for filtering pornographic text on both Thai and English language Web sites. For images, we build and examine a filtering system that uses a hierarchical image filtering method. It consists of two main processes: a normalized R/G ratio, which uses pixel ratios of the red and green color channels, and a human composition matrix (HCM) based on skin detection. The empirical results show that our text and image analysis methods are effective for pornographic Web filtering. Finally, we have incorporated a pornographic Web filter using content-based analysis into our Anti-X system.
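
    The normalized R/G ratio step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no thresholds or decision rule, so the constants `RG_MIN`, `RG_MAX`, and `SKIN_FRACTION_LIMIT` and the helper names `is_skin`, `skin_fraction`, and `flag_image` are all hypothetical, chosen only to show the general idea of labelling pixels as skin by their red-to-green ratio and then measuring how much of an image is skin.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255

# Hypothetical values for illustration; the abstract does not state any.
RG_MIN = 1.05               # lower bound on the normalized R/G ratio
RG_MAX = 4.0                # upper bound on the normalized R/G ratio
SKIN_FRACTION_LIMIT = 0.4   # share of skin pixels that flags an image


def is_skin(pixel: Pixel) -> bool:
    """Label one pixel as skin if its normalized R/G ratio is in range."""
    r, g, b = pixel
    total = r + g + b
    if total == 0 or g == 0:
        return False  # black pixel or no green: ratio undefined
    # Normalize each channel by total intensity to reduce brightness effects.
    r_norm = r / total
    g_norm = g / total
    return RG_MIN <= r_norm / g_norm <= RG_MAX


def skin_fraction(pixels: Iterable[Pixel]) -> float:
    """Return the fraction of pixels labelled as skin (0.0 if empty)."""
    pixel_list = list(pixels)
    if not pixel_list:
        return 0.0
    return sum(is_skin(p) for p in pixel_list) / len(pixel_list)


def flag_image(pixels: Iterable[Pixel]) -> bool:
    """Flag an image for further checks when enough pixels look like skin."""
    return skin_fraction(pixels) >= SKIN_FRACTION_LIMIT
```

    In a hierarchical filter of the kind the abstract describes, a cheap colour test like this would plausibly run first, with only the flagged images passed on to a more expensive stage such as the human composition matrix.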
  • Keywords
    Internet; Web sites; image colour analysis; security of data; support vector machines; text analysis; Bayes theorem; Internet; N-gram model; Web pornography patrol system; anti-X system; content-based Web filtering; content-based analysis; human composition matrix; pornographic Web sites; support vector machine algorithm; text analysis; Color; Humans; Image analysis; Information filtering; Information filters; Internet; Natural languages; Safety; Support vector machines; Text analysis; Content-based analysis; Image filtering; Pornographic web filtering; Text filtering;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man and Cybernetics, 2008. SMC 2008. IEEE International Conference on
  • Conference_Location
    Singapore
  • ISSN
    1062-922X
  • Print_ISBN
    978-1-4244-2383-5
  • Electronic_ISBN
    1062-922X
  • Type
    conf
  • DOI
    10.1109/ICSMC.2008.4811326
  • Filename
    4811326