DocumentCode :
2482668
Title :
An Online Linear Chinese Spam Emails Filtering System
Author :
Qiu, Yongqin ; Xu, Yan ; Wang, Bin
Author_Institution :
Beijing Language & Culture Univ., Beijing, China
fYear :
2010
fDate :
22-23 May 2010
Firstpage :
1
Lastpage :
4
Abstract :
Spam is a key problem in electronic communication. The increasing volume of spam has become a serious threat not only to the Internet, but also to society. Content-based filtering is one mainstream method of combating this threat in its various forms, but previous content-based filtering methods struggle to balance efficiency and effectiveness. In this paper we seek a linear solution to this problem, and explore two online linear classifiers, the Perceptron and Winnow, for this task on three benchmark corpora: the English corpora PU1 and Lingspam and the Chinese corpus 2005-Jun. Our experiments show that both classifiers filter spam emails effectively as well as efficiently. They are also shown to perform much better than a standard Naïve Bayes method. In fact, to the best of our knowledge, they achieve state-of-the-art performance for filtering Chinese spam emails, at least on the above corpora. Furthermore, both classifiers are easy to update adaptively, and are thus suitable for real dynamic environments.
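The abstract names the two online linear classifiers but does not give their update rules. The following is a minimal sketch of the textbook mistake-driven forms, not the authors' implementation: the Perceptron updates weights additively and Winnow updates them multiplicatively. The binary bag-of-words features, the hyperparameters (learning_rate, alpha, beta, theta), the averaging of active weights in Winnow, and the toy English tokens are all assumptions made for illustration; none of them comes from the paper.

```python
from typing import Dict, Iterable, List, Tuple

# (tokens, label); label +1 = spam, -1 = legitimate. Assumed encoding.
Example = Tuple[List[str], int]


class Perceptron:
    """Textbook perceptron: additive updates, applied only on mistakes."""

    def __init__(self, learning_rate: float = 1.0) -> None:
        self.weights: Dict[str, float] = {}
        self.bias = 0.0
        self.learning_rate = learning_rate

    def score(self, tokens: Iterable[str]) -> float:
        # Binary features: each distinct token counts once.
        return self.bias + sum(self.weights.get(t, 0.0) for t in set(tokens))

    def predict(self, tokens: Iterable[str]) -> int:
        return 1 if self.score(tokens) > 0 else -1

    def update(self, tokens: Iterable[str], label: int) -> bool:
        """Learn from one example; return True if a mistake was made."""
        active = set(tokens)
        if self.predict(active) == label:
            return False
        for t in active:
            self.weights[t] = self.weights.get(t, 0.0) + self.learning_rate * label
        self.bias += self.learning_rate * label
        return True


class Winnow:
    """Textbook positive Winnow: multiplicative updates, applied only on mistakes.

    Assumption: the score is the mean weight of the active features, so a
    fixed threshold works without knowing the vocabulary size in advance.
    """

    def __init__(self, alpha: float = 2.0, beta: float = 0.5, theta: float = 1.0) -> None:
        self.weights: Dict[str, float] = {}
        self.alpha = alpha  # promotion factor (> 1)
        self.beta = beta    # demotion factor (< 1)
        self.theta = theta  # decision threshold

    def score(self, tokens: Iterable[str]) -> float:
        active = set(tokens)
        if not active:
            return 0.0
        # Unseen features start at weight 1.0.
        return sum(self.weights.get(t, 1.0) for t in active) / len(active)

    def predict(self, tokens: Iterable[str]) -> int:
        return 1 if self.score(tokens) > self.theta else -1

    def update(self, tokens: Iterable[str], label: int) -> bool:
        """Learn from one example; return True if a mistake was made."""
        active = set(tokens)
        if self.predict(active) == label:
            return False
        factor = self.alpha if label == 1 else self.beta
        for t in active:
            self.weights[t] = self.weights.get(t, 1.0) * factor
        return True


def train_online(model, stream: List[Example], epochs: int = 1) -> int:
    """Feed examples one at a time; return the total number of mistakes."""
    mistakes = 0
    for _ in range(epochs):
        for tokens, label in stream:
            mistakes += model.update(tokens, label)
    return mistakes


# Toy, made-up messages (already tokenized) purely for demonstration.
TOY_STREAM: List[Example] = [
    (["free", "money", "now"], 1),
    (["meeting", "tomorrow", "morning"], -1),
    (["win", "free", "prize"], 1),
    (["project", "report", "attached"], -1),
]

perceptron = Perceptron()
winnow = Winnow()
perceptron_mistakes = train_online(perceptron, TOY_STREAM, epochs=2)
winnow_mistakes = train_online(winnow, TOY_STREAM, epochs=2)
```

Because both models change their weights only when a prediction is wrong, a deployed filter could keep calling update on newly labeled messages, which is the kind of adaptive updating the abstract refers to. Chinese text would additionally need a word segmentation or character n-gram step before tokens reach either model; that step is outside this sketch.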
Keywords :
Bayes methods; Internet; e-mail filters; perceptrons; security of data; unsolicited e-mail; Chinese corpus; Chinese spam emails filtering system; English corpus PU1; Internet threat; Lingspam; Perceptron; Winnow; content-based filtering; corpora; electronic communication; mainstream method; online linear classifiers; real dynamic environment; society threat; standard Naive Bayes method; state-of-the-art performance; Bayesian methods; Computers; Electronic mail; Information filtering; Information filters; Internet; Natural languages; Niobium; Nonlinear filters; Unsolicited electronic mail;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
e-Business and Information System Security (EBISS), 2010 2nd International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-5893-6
Electronic_ISBN :
978-1-4244-5895-0
Type :
conf
DOI :
10.1109/EBISS.2010.5473478
Filename :
5473478