Title :
Statistical and structural approaches to filtering Internet pornography
Author :
Ho, Wai H. ; Watters, Paul A.
Author_Institution :
Div. of Inf. & Comput. Sci., Macquarie Univ., Sydney, NSW
Abstract :
The WWW is a major source of unintentional exposure to pornography. Current content filtering technology using blacklisting or simple keyword searching is ineffective: today's filters produce many false positives and false negatives, and require tedious manual updating. This study examined how content filtering of pornographic Web page text, based on structural and statistical analysis, could greatly improve accuracy. Systematic differences between pornographic and non-pornographic Web pages were found, with Bayesian classification yielding 99.1% accuracy in text classification from pornographic and non-pornographic corpora.
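As an illustration of the Bayesian text classification the abstract refers to, the following is a minimal sketch of a multinomial naive Bayes classifier with add-one smoothing. It is not the authors' implementation: the record does not describe their features, corpora, or model variant, so the labels ("blocked", "allowed"), the toy training texts, and the function names `train` and `classify` are all assumptions made for illustration.

```python
import math
from collections import Counter


def train(docs):
    """Build a model from (label, text) pairs: per-label word counts,
    per-label document counts, and the shared vocabulary."""
    word_counts = {}
    doc_counts = Counter()
    vocab = set()
    for label, text in docs:
        tokens = text.lower().split()
        word_counts.setdefault(label, Counter()).update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the label with the highest posterior log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        # Log prior: fraction of training documents carrying this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in text.lower().split():
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothed log likelihood of the word.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus, for illustration only.
TRAINING = [
    ("blocked", "explicit adult content adult"),
    ("blocked", "adult explicit gallery"),
    ("allowed", "weather forecast sydney rain"),
    ("allowed", "university lecture notes forecast"),
]
MODEL = train(TRAINING)

print(classify("adult gallery", MODEL))
print(classify("sydney weather lecture", MODEL))
```

A filter built this way learns word statistics from labelled pages instead of relying on a fixed keyword list, which is the contrast with blacklisting and keyword search that the abstract draws.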
Keywords :
Bayes methods; Internet; Web sites; authorisation; content management; statistical analysis; Bayesian classification; Internet pornography filtering; World Wide Web; blacklisting; content filtering; Bayesian methods; Information filtering; Information filters; Keyword search; Manuals; Text categorization; Web pages
Conference_Title :
Systems, Man and Cybernetics, 2004 IEEE International Conference on
Conference_Location :
The Hague
Print_ISBN :
0-7803-8566-7
DOI :
10.1109/ICSMC.2004.1401289