• DocumentCode
    2956651
  • Title
    A Novel Spam Filtering Framework Based on Fuzzy Adaptive Particle Swarm Optimization
  • Author
    Wu, Hao ; Li, Hong-zuo ; Wang, Gang ; Chen, Hui-ling ; Li, Xiao-kui
  • Volume
    1
  • fYear
    2011
  • fDate
    28-29 March 2011
  • Firstpage
    38
  • Lastpage
    41
  • Abstract
    E-mail has revolutionized traditional communication because it is convenient, economical, fast, and easy to use. A major bottleneck in electronic communication is the enormous volume of unwanted, harmful e-mails known as spam. In this paper, a novel spam filtering framework (NSFF) is proposed, based on particle swarm optimization, fuzzy logic control, F-score and the support vector machine (SVM). We propose a fuzzy adaptive particle swarm optimization (FAPSO) to find an optimal feature subset. To identify a feature subset within a large dataset contaminated with high-dimensional noise, the proposed method is divided into three stages: core feature subset selection, feature subset selection, and spam filtering. In the first stage, the F-score is used to calculate the importance of each feature and to construct a core feature set, from which a number of core feature subsets are obtained. In the second stage, FAPSO is initialized from the core feature subsets and adjusted adaptively via fuzzy logic control to obtain an optimal feature subset. In the final stage, an SVM is employed as the classifier, and the input e-mails are classified using the optimal feature subset. Three publicly available benchmark corpora for spam filtering, PU1, Ling-Spam and SpamAssassin, are used in our experiments. The numerical results and statistical analysis show that the proposed approach is capable of finding an optimal feature subset in a large, noisy dataset. In addition, NSFF achieves significantly better prediction accuracy than the other methods while using a smaller subset of features.
  • Keywords
    benchmark testing; e-mail filters; fuzzy control; particle swarm optimisation; pattern classification; set theory; support vector machines; unsolicited e-mail; F-score; SVM classifier; benchmark corpora; e-mail; electronic communication; feature subset selection; fuzzy adaptive particle swarm optimization; fuzzy logic control; spam filtering framework; support vector machine; Accuracy; Filtering; Frequency modulation; Optical fibers; Particle swarm optimization; Support vector machines; Unsolicited electronic mail; feature selection; particle swarm optimization; spam filtering; support vector machines;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Computation Technology and Automation (ICICTA), 2011 International Conference on
  • Conference_Location
    Shenzhen, Guangdong
  • Print_ISBN
    978-1-61284-289-9
  • Type
    conf
  • DOI
    10.1109/ICICTA.2011.17
  • Filename
    5750527
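
  • Illustration (not part of the original record)
    The first stage described in the abstract ranks features by F-score before FAPSO refines the subset. The record does not give the formula, so the sketch below assumes the F-score commonly used for SVM feature selection: the squared distances of the two class means from the overall mean, divided by the sum of the two within-class sample variances. The function names `f_scores` and `core_features`, the 1 = spam / 0 = legitimate labelling, and the top-k cut-off are hypothetical choices made for illustration; the paper may construct its core feature subsets differently.

```python
from statistics import mean, variance


def f_scores(rows, labels):
    """Return one F-score per feature column.

    Assumption: binary labels, 1 = spam and 0 = legitimate mail,
    with at least two messages in each class.
    """
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]
    scores = []
    for j in range(len(rows[0])):
        col_all = [r[j] for r in rows]
        col_pos = [r[j] for r in pos]
        col_neg = [r[j] for r in neg]
        overall = mean(col_all)
        # Between-class term: how far each class mean sits from the overall mean.
        between = (mean(col_pos) - overall) ** 2 + (mean(col_neg) - overall) ** 2
        # Within-class term: sum of the two sample variances (n - 1 denominator).
        within = variance(col_pos) + variance(col_neg)
        scores.append(between / within if within > 0 else 0.0)
    return scores


def core_features(rows, labels, k):
    """Return the indices of the k highest-scoring features (hypothetical cut-off)."""
    scores = f_scores(rows, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy data: feature 0 separates the classes, feature 1 does not.
rows = [[5.0, 1.0], [6.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
labels = [1, 1, 0, 0]
print(f_scores(rows, labels))
print(core_features(rows, labels, 1))
```

    A larger score means the feature discriminates better between the two classes when considered on its own. The score ignores interactions between features, which is one reason a search stage such as FAPSO would follow it.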