DocumentCode :
3140026
Title :
Incremental Naïve Bayesian Spam Mail Filtering and Variant Incremental Training
Author :
Taninpong, Phimphaka ; Ngamsuriyaroj, Sudsanguan
Author_Institution :
Dept. of Comput. Sci., Mahidol Univ., Bangkok, Thailand
fYear :
2009
fDate :
1-3 June 2009
Firstpage :
383
Lastpage :
387
Abstract :
This paper proposes an incremental spam mail filtering approach based on Naive Bayesian classification, chosen for its simplicity and adaptability. To keep the training set small and bounded in size, a sliding window is applied, and the training set is updated as new emails are received. In effect, the features in the training set are incrementally updated, so the model can adapt to new spam patterns. In addition, we present three incremental training schemes: a window containing only the most recent emails, a window containing the previous batch of emails, and a window containing all emails seen so far. The proposed model is evaluated on two spam corpora: Trec05p-1 and Trec06p. In our experiments, the window size is varied, and the processing time per message and the ham and spam misclassification rates are measured. The results show that the third incremental training scheme gives the best outcomes, and that the window size significantly affects both the misclassification rates and the processing time.
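The sliding-window idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The class name `SlidingWindowNB`, the pre-tokenized input, the multinomial word-count model, and the Laplace smoothing constant `alpha` are all assumptions made for illustration. The sketch keeps only the most recent `window_size` labeled emails and adjusts the word counts when an email enters or leaves the window, so the model never has to be retrained from scratch.

```python
import math
from collections import Counter, deque


class SlidingWindowNB:
    """Hypothetical sliding-window multinomial Naive Bayes (illustrative only)."""

    LABELS = ("ham", "spam")

    def __init__(self, window_size, alpha=1.0):
        self.window_size = window_size   # max number of emails kept for training
        self.alpha = alpha               # Laplace smoothing constant (assumed)
        self.window = deque()            # (tokens, label) pairs, oldest first
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.total_words = {label: 0 for label in self.LABELS}
        self.doc_counts = {label: 0 for label in self.LABELS}

    def _apply(self, tokens, label, sign):
        # sign=+1 adds an email's counts to the model, sign=-1 removes them.
        self.doc_counts[label] += sign
        self.total_words[label] += sign * len(tokens)
        counts = self.word_counts[label]
        for tok in tokens:
            counts[tok] += sign
            if counts[tok] == 0:
                del counts[tok]          # keep the vocabulary limited to the window

    def update(self, tokens, label):
        # Add the new labeled email, then evict the oldest one if the window is full.
        tokens = list(tokens)
        self.window.append((tokens, label))
        self._apply(tokens, label, +1)
        if len(self.window) > self.window_size:
            old_tokens, old_label = self.window.popleft()
            self._apply(old_tokens, old_label, -1)

    def predict(self, tokens):
        # Compare smoothed log-posteriors; ties go to "ham" (listed first).
        vocab = set()
        for label in self.LABELS:
            vocab.update(self.word_counts[label])
        vocab_size = len(vocab) or 1
        n_docs = sum(self.doc_counts.values())
        scores = {}
        for label in self.LABELS:
            prior = (self.doc_counts[label] + self.alpha) / (
                n_docs + self.alpha * len(self.LABELS))
            score = math.log(prior)
            denom = self.total_words[label] + self.alpha * vocab_size
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + self.alpha) / denom)
            scores[label] = score
        return max(scores, key=scores.get)
```

A caller would invoke `update(tokens, label)` for each newly labeled email and `predict(tokens)` for each incoming one. This per-message window is closest in spirit to the first scheme, a window of the most recent emails. The other two schemes named in the abstract would need a different eviction policy, which this sketch does not attempt to reproduce.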
Keywords :
belief networks; e-mail filters; pattern classification; unsolicited e-mail; Naive Bayesian classification; Naive Bayesian spam mail filtering; Trec05p-1; Trec06p; incremental spam mail filtering; sliding window; spam misclassification rates; spam pattern; training set; variant incremental training; Availability; Bayesian methods; Computer network reliability; Computer networks; Computer science; Filtering; Peer to peer computing; Postal services; Space technology; Unsolicited electronic mail; Naïve Bayesian classification; incremental; spam mail filtering;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer and Information Science, 2009. ICIS 2009. Eighth IEEE/ACIS International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-0-7695-3641-5
Type :
conf
DOI :
10.1109/ICIS.2009.176
Filename :
5222907