DocumentCode
263311
Title
Automated generation of ham rules for Vietnamese spam filtering
Author
Quan Dang Dinh ; Quang Anh Tran ; Jiang, Frank
Author_Institution
Fac. of Inf. Technol., Hanoi Univ., Hanoi, Vietnam
fYear
2014
fDate
14-17 Dec. 2014
Firstpage
1
Lastpage
5
Abstract
The topic of spam filtering has been thoroughly studied by researchers over the past few decades. There have been successful works with high spam detection rates, yet no paper has described a method that can effectively detect spam and, at the same time, measure the importance of ham emails. In this paper, the authors propose a method of generating SpamAssassin rules that can indicate the degree of importance of an email message. Specifically, we added a proportion of negatively weighted ham rules and adapted HPSOWM, an efficient evolutionary algorithm, to optimize SpamAssassin rule scores. As a result, using our new rule set, SpamAssassin is able to give indicative scores for both spam and ham. These scores can be utilized by email clients to categorize incoming messages based on their importance to the user. Various experiments were conducted to evaluate our method. In addition, a conclusion was drawn about the best ratio of spam rules to ham rules.
Keywords
e-mail filters; evolutionary computation; information filtering; natural language processing; SpamAssassin rule; Vietnamese spam filtering; adapted HPSOWM; automated generation; email client; email message; evolutionary algorithm; ham email; negatively weighted ham rules; spam detection rate; Accuracy; Educational institutions; Error analysis; Training; Unsolicited electronic mail; HPSOWM; SpamAssassin; automated anti-spam rules; ham rules; spam filtering;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence for Security and Defense Applications (CISDA), 2014 Seventh IEEE Symposium on
Conference_Location
Hanoi
Type
conf
DOI
10.1109/CISDA.2014.7035628
Filename
7035628