Title :
A web pornography patrol system by content-based analysis: In particular text and image
Author :
Polpinij, Jantima ; Sibunruang, Chumsak ; Paungpronpitag, Somnuk ; Chamchong, Rapeeporn ; Chotthanom, Anirut
Author_Institution :
Fac. of Inf., Mahasarakham Univ., Maha Sarakham
Abstract :
Children's exposure to pornographic Web sites on the Internet raises safety concerns. To protect children from such inappropriate material, an effective Web filtering system is essential. Content-based Web filtering is one of the important techniques for handling and filtering inappropriate information on the Web. In this paper, we examine a content-based analysis technique for filtering pornographic Web sites. Our system consists of two primary content-based filtering techniques: text and image. For text analysis, the support vector machine (SVM) algorithm and an N-gram model based on Bayes' theorem are applied and evaluated for filtering pornographic text on both Thai- and English-language Web sites. For image analysis, we build and examine an image filtering system that uses a hierarchical image filtering method. It consists of two main processes: a normalized R/G ratio, which uses pixel ratios of the red and green color channels, and a human composition matrix (HCM) based on skin detection. The empirical results show that our text and image analysis methods are effective for pornographic Web filtering. Finally, we have incorporated the content-based pornographic Web filter into our Anti-X system.
Keywords :
Internet; Web sites; image colour analysis; security of data; support vector machines; text analysis; Bayes theorem; N-gram model; Web pornography patrol system; anti-X system; content-based Web filtering; content-based analysis; human composition matrix; pornographic Web sites; support vector machine algorithm; Color; Humans; Image analysis; Information filtering; Information filters; Natural languages; Safety; Image filtering; Pornographic web filtering; Text filtering
Conference_Title :
Systems, Man and Cybernetics, 2008. SMC 2008. IEEE International Conference on
Conference_Location :
Singapore
Print_ISBN :
978-1-4244-2383-5
ISSN :
1062-922X
DOI :
10.1109/ICSMC.2008.4811326
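
Illustrative note on the abstract: the normalized R/G ratio step could be sketched roughly as below. This is only a minimal sketch under stated assumptions, not the authors' implementation. The function names, the ratio bounds `low` and `high`, and the handling of black pixels are all hypothetical; the abstract does not give thresholds or describe how the ratio feeds into the human composition matrix.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def normalized_rg_ratio(pixel: Pixel) -> float:
    """Ratio of the normalized red and green channels of one pixel.

    r = R / (R + G + B) and g = G / (R + G + B); the result is r / g.
    The channel sum cancels, so this equals R / G when it is defined.
    """
    red, green, blue = pixel
    total = red + green + blue
    if total == 0 or green == 0:
        # Assumption for this sketch: undefined ratios count as non-skin.
        return 0.0
    return (red / total) / (green / total)


def skin_fraction(pixels: Iterable[Pixel],
                  low: float = 1.05,
                  high: float = 4.0) -> float:
    """Fraction of pixels whose ratio lies in [low, high].

    The bounds are hypothetical placeholders, not values from the paper.
    """
    pixel_list = list(pixels)
    if not pixel_list:
        return 0.0
    hits = sum(1 for p in pixel_list
               if low <= normalized_rg_ratio(p) <= high)
    return hits / len(pixel_list)
```

In a hierarchical filter of the kind the abstract describes, a cheap per-pixel test like this would plausibly run first, with the more expensive skin-region analysis applied only to images that pass it; that ordering is an assumption here.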