DocumentCode
3109694
Title
A web pornography patrol system by content-based analysis: In particular text and image
Author
Polpinij, Jantima ; Sibunruang, Chumsak ; Paungpronpitag, Somnuk ; Chamchong, Rapeeporn ; Chotthanom, Anirut
Author_Institution
Fac. of Inf., Mahasarakham Univ., Maha Sarakham
fYear
2008
fDate
12-15 Oct. 2008
Firstpage
500
Lastpage
505
Abstract
Children's exposure to pornographic Web sites on the Internet raises safety concerns. To protect children from this inappropriate material, an effective Web filtering system is essential. Content-based Web filtering is one of the important techniques for handling and filtering inappropriate information on the Web. In this paper, we examine a content-based analysis technique for filtering pornographic Web sites. Our system consists of two primary content-based filtering techniques: text and image. For text analysis, the support vector machine (SVM) algorithm and an N-gram model based on Bayes' theorem are applied and tested for filtering pornographic text on both Thai- and English-language Web sites. For images, we build and examine an image filtering system that uses a hierarchical image filtering method. It consists of two main processes: a normalized R/G ratio, which uses pixel ratios of the red and green color channels, and a human composition matrix (HCM) based on skin detection. The empirical results show that our text and image analysis methods are effective for pornographic Web filtering. Finally, we have incorporated a pornographic Web filter based on content-based analysis into our Anti-X system.
Keywords
Internet; Web sites; image colour analysis; security of data; support vector machines; text analysis; Bayes theorem; N-gram model; Web pornography patrol system; anti-X system; content-based Web filtering; content-based analysis; human composition matrix; pornographic Web sites; support vector machine algorithm; Color; Humans; Image analysis; Information filtering; Information filters; Natural languages; Safety; Support vector machines; Text analysis; Content-based analysis; Image filtering; Pornographic web filtering; Text filtering
fLanguage
English
Publisher
IEEE
Conference_Titel
Systems, Man and Cybernetics, 2008. SMC 2008. IEEE International Conference on
Conference_Location
Singapore
ISSN
1062-922X
Print_ISBN
978-1-4244-2383-5
Electronic_ISBN
1062-922X
Type
conf
DOI
10.1109/ICSMC.2008.4811326
Filename
4811326