Title :
Revisit Bayesian Approaches for Spam Detection
Author :
Yeh, Chun-Chao ; Chiang, Soun-Jan
Author_Institution :
Dept. of Comput. Sci., Nat. Taiwan Univ., Keelung
Abstract :
Because spam techniques change quickly, we argue that multiple spam detection strategies should be developed to combat spam effectively. The (naive) Bayesian filter is one of the strategies that have been proposed. Although some early work has been done on Bayesian approaches to spam detection, it is unknown whether such a simple approach is still effective at filtering spam mail today. In this paper, we re-evaluate the Bayesian approach using a public spam database. We found that the Bayesian approach can achieve a high spam detection rate for plain-text mail. At the same time, however, we found that some practical issues in using Bayesian filters, for example multimedia content and the different message encoding schemes used for non-English characters, need to be handled carefully.
Keywords :
Bayes methods; security of data; unsolicited e-mail; multiple spam detection; naive Bayesian filter; public spam database; Bayesian methods; Computer science; Electronic mail; Encoding; HTML; Multimedia databases; Postal services; Support vector machines; Unsolicited electronic mail; Web and internet services; spam; spam detection
Conference_Title :
Young Computer Scientists, 2008. ICYCS 2008. The 9th International Conference for
Conference_Location :
Hunan
Print_ISBN :
978-0-7695-3398-8
Electronic_ISBN :
978-0-7695-3398-8
DOI :
10.1109/ICYCS.2008.434