DocumentCode :
3008784
Title :
Classification of email using BeaKS: Behavior and keyword stemming
Author :
Bhat, Veena H. ; Malkani, Vandana R. ; Shenoy, P. Deepa ; Venugopal, K.R. ; Patnaik, L.M.
Author_Institution :
Dept. of CSE, Univ. Visvesvaraya, Bangalore, India
fYear :
2011
fDate :
21-24 Nov. 2011
Firstpage :
1139
Lastpage :
1143
Abstract :
Spam mails are one of the greatest challenges faced by internet service providers, organizations and internet users alike. Spam may be targeted, sent with malicious intent, or sent simply as commercial marketing; on the whole, it is unwanted by everyone except the sender. Spam filters evolve continuously as spammers become more technically savvy and creative. Machine learning algorithms are widely used to classify mails as spam or ham (legitimate email). This work presents a spam filter, BeaKS, with a focused preprocessing phase that combines the content of the email with two behavioral characteristics extracted from it to predict the category a mail belongs to: spam or ham. The accuracy of the proposed prediction model, using Random Forests as the classifier, is shown to be superior to that of other recent techniques. The approach is simple, easy to implement and reliable.
Keywords :
learning (artificial intelligence); pattern classification; security of data; unsolicited e-mail; BeaKS; behavioral characteristics; classifier; email classification; ham mails; keyword stemming; machine learning algorithms; random forests; spam filters; spam mails; Accuracy; Artificial neural networks; Feature extraction; Niobium; Postal services; Unsolicited electronic mail; Email classification; email content; machine learning; random forests; spammer behaviour;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
TENCON 2011 - 2011 IEEE Region 10 Conference
Conference_Location :
Bali
ISSN :
2159-3442
Print_ISBN :
978-1-4577-0256-3
Type :
conf
DOI :
10.1109/TENCON.2011.6129290
Filename :
6129290