DocumentCode :
2490441
Title :
Heterogeneous Bayesian ensembles for classifying spam emails
Author :
Wang, Wenjia
Author_Institution :
Sch. of Comput. Sci., Univ. of East Anglia, Norwich, UK
fYear :
2010
fDate :
18-23 July 2010
Firstpage :
1
Lastpage :
8
Abstract :
Spam emails have become a major problem in Internet communication and can cause potentially serious adverse effects on recipients if unidentified. Many spam filters have been developed to filter out certain spam emails, but as spammers continuously improve their spamming techniques, the existing filters may become less effective. This paper presents a heterogeneous ensemble approach that combines several methodologically different filters to work collectively to improve accuracy and reliability in identifying spam emails. A special procedure for building heterogeneous and homogeneous ensembles with a Bayesian filter as the base learner has been devised, and a framework has been designed and implemented. After the framework was verified intensively with 10 other benchmark data sets, it was applied to identify spam emails. The experiments with a spam benchmark corpus indicated that the heterogeneous ensembles achieved more accurate and reliable classifications than the individual filters and the other ensemble filters.
Keywords :
Internet; belief networks; information filtering; software reliability; unsolicited e-mail; Bayesian filter; Internet communication; heterogeneous Bayesian ensembles; spam benchmark corpus; spam emails; spamming techniques; Accuracy; Buildings; Data models; Electronic mail; Filtering algorithms; Niobium; Probability;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Neural Networks (IJCNN), The 2010 International Joint Conference on
Conference_Location :
Barcelona
ISSN :
1098-7576
Print_ISBN :
978-1-4244-6916-1
Type :
conf
DOI :
10.1109/IJCNN.2010.5596545
Filename :
5596545