• DocumentCode
    263311
  • Title
    Automated generation of ham rules for Vietnamese spam filtering
  • Author
    Quan Dang Dinh ; Quang Anh Tran ; Jiang, Frank
  • Author_Institution
    Fac. of Inf. Technol., Hanoi Univ., Hanoi, Vietnam
  • fYear
    2014
  • fDate
    14-17 Dec. 2014
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    The topic of spam filtering has been thoroughly studied by researchers in the past few decades. There have been successful works with high spam detection rates, yet no paper has described a method which can effectively detect spam and, at the same time, measure the importance of ham emails. In this paper, the authors propose a method of generating SpamAssassin rules which can indicate the degree of importance of an email message. Specifically, we added a proportion of negatively weighted ham rules and adapted HPSOWM, an efficient evolutionary algorithm, to optimize SpamAssassin rule scores. As a result, using our new rule set, SpamAssassin is able to give indicative scores for both spam and ham. These scores can be utilized by email clients to categorize incoming messages based on their importance to the user. Various experiments were conducted to evaluate our method. In addition, a conclusion was drawn about the best ratio of spam rules to ham rules.
  • Keywords
    e-mail filters; evolutionary computation; information filtering; natural language processing; SpamAssassin rule; Vietnamese spam filtering; adapted HPSOWM; automated generation; email client; email message; evolutionary algorithm; ham email; negatively weighted ham rules; spam detection rate; Accuracy; Educational institutions; Error analysis; Training; Unsolicited electronic mail; HPSOWM; SpamAssassin; automated anti-spam rules; ham rules; spam filtering;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence for Security and Defense Applications (CISDA), 2014 Seventh IEEE Symposium on
  • Conference_Location
    Hanoi
  • Type
    conf
  • DOI
    10.1109/CISDA.2014.7035628
  • Filename
    7035628