DocumentCode :
2760754
Title :
Content-based concept drift detection for Email spam filtering
Author :
Hayat, Morteza Zi ; Basiri, Javad ; Seyedhossein, Leila ; Shakery, Azadeh
Author_Institution :
Sch. of Electr. & Comput. Eng., Univ. of Tehran, Tehran, Iran
fYear :
2010
fDate :
4-6 Dec. 2010
Firstpage :
531
Lastpage :
536
Abstract :
The continued growth of email usage, which is naturally followed by an increase in unsolicited emails (so-called spam), motivates research in spam filtering. In spam filtering systems, a persistent challenge is the evolving nature of spam, which renders existing models obsolete. This paper proposes an adaptive spam filtering system based on language models that detects concept drift by computing the deviation in the distribution of email contents. The proposed method can be used with any existing classifier; in this paper, Naïve Bayes is used as the classifier. The method is evaluated on the Enron data set. The results indicate that the method detects concept drift effectively and outperforms the Naïve Bayes classifier in terms of accuracy.
Keywords :
pattern classification; security of data; unsolicited e-mail; content-based concept drift detection; email content distribution deviation; email spam filtering; naive Bayes classifier method; unsolicited emails; Accuracy; Adaptation model; Computational modeling; Electronic mail; Filtering; Testing; Training data; KL divergence; concept drift; language model; spam filtering;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Telecommunications (IST), 2010 5th International Symposium on
Conference_Location :
Tehran
Print_ISBN :
978-1-4244-8183-5
Type :
conf
DOI :
10.1109/ISTEL.2010.5734082
Filename :
5734082