DocumentCode :
1633470
Title :
Enhance Term Weighting Algorithm as Feature Selection Technique for Illicit Web Content Classification
Author :
Lee, Zhi-Sam ; Maarof, Mohd Aizaini ; Selamat, Ali ; Shamsuddin, Siti Mariyam
Author_Institution :
Fac. of Comput. Sci. & Inf. Syst., Univ. Teknol. Malaysia, Skudai
Volume :
2
fYear :
2008
Firstpage :
145
Lastpage :
150
Abstract :
The exponential growth of information on the Internet has raised concerns about information security. Pornographic Web content is one of the most harmful resources, polluting the minds of children and teenagers. Several content-based Web analysis approaches have been proposed to prevent children from accessing such illicit Web content. However, the implementation of each solution remains an issue. Most approaches are weak at classifying highly similar Web content, such as pornography and gynecology Web pages. In this study, we try to address this issue by proposing a modified term weighting scheme that is used as a term feature selection technique for illicit Web page classification. We examine the performance of the proposed technique on three data sets that represent three critical scenarios and compare it with the original term weighting scheme. Based on our observations, the proposed technique shows its superiority for illicit Web page classification, achieving an accuracy rate above 90% on average. The experimental results also indicate that the proposed technique improves on the original term weighting scheme. We hope that this study will give insight to other researchers, especially those working in similar areas.
Keywords :
Internet; security of data; enhance term weighting algorithm; feature selection technique; gynecology Web pages; illicit Web content classification; information security; pornography Web content; Business; Entropy; Gynaecology; Image analysis; Information filtering; Information filters; Pollution; Uniform resource locators; Web pages; feature selection; neural network; term weighting scheme; text categorization; web filtering
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Intelligent Systems Design and Applications, 2008. ISDA '08. Eighth International Conference on
Conference_Location :
Kaohsiung
Print_ISBN :
978-0-7695-3382-7
Type :
conf
DOI :
10.1109/ISDA.2008.171
Filename :
4696322