• DocumentCode
    2485563
  • Title
    Automatic Personalized Spam Filtering through Significant Word Modeling
  • Author
    Junejo, Khurum Nazir; Karim, Asim
  • Author_Institution
    Lahore Univ. of Manage. Sci., Lahore
  • Volume
    2
  • fYear
    2007
  • fDate
    29-31 Oct. 2007
  • Firstpage
    291
  • Lastpage
    298
  • Abstract
    Typically, spam filters are built on the assumption that the characteristics of e-mails in the training set are identical to those in the individual users' inboxes to which the filter will be applied. This assumption is often incorrect, leading to poor filter performance. A personalized spam filter is built by taking into account the characteristics of e-mails in individual users' inboxes. We present an automatic approach to personalized spam filtering that does not require user feedback. The proposed algorithm builds a statistical model of significant spam and non-spam words from the labeled training set and then updates it in multiple passes over the individual user's unlabeled inbox. The personalization of the model leads to improved filtering performance. We evaluate our algorithm on two publicly available datasets. The results show that our algorithm is robust and scalable, and a viable solution to the server-side personalized spam filtering problem. Moreover, it outperforms published results on one dataset and performs on par with other approaches on the second dataset.
  • Keywords
    information filtering; statistical analysis; unsolicited e-mail; automatic personalized spam filtering; available datasets; e-mails; labeled training set; statistical model; training set; word modeling; Artificial intelligence; Computer science; Conference management; Electronic mail; Feedback; Filtering algorithms; Filters; Management training; Robustness; Unsolicited electronic mail
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Tools with Artificial Intelligence, 2007. ICTAI 2007. 19th IEEE International Conference on
  • Conference_Location
    Patras
  • ISSN
    1082-3409
  • Print_ISBN
    978-0-7695-3015-4
  • Type
    conf
  • DOI
    10.1109/ICTAI.2007.66
  • Filename
    4410394
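
The abstract describes building a model of significant spam and non-spam words from labeled training mail and then updating it over several passes on an unlabeled inbox. The following is a minimal sketch of that general idea only, not the authors' algorithm: the smoothed document-frequency ratio test, the `ratio` threshold, the scoring rule, the number of passes, and all function names are assumptions made for illustration.

```python
from collections import Counter


def doc_freq(emails):
    """Count, for each word, how many emails contain it."""
    counts = Counter()
    for text in emails:
        counts.update(set(text.lower().split()))
    return counts


def build_model(spam, ham, ratio=2.0):
    """Keep only 'significant' words: those clearly more common in one class.

    Assumption: significance is a smoothed document-frequency ratio of at
    least `ratio`. Spam words get positive weights, non-spam words negative.
    """
    spam_df, ham_df = doc_freq(spam), doc_freq(ham)
    model = {}
    for word in set(spam_df) | set(ham_df):
        p_spam = (spam_df[word] + 1) / (len(spam) + 2)
        p_ham = (ham_df[word] + 1) / (len(ham) + 2)
        if p_spam / p_ham >= ratio:
            model[word] = p_spam / p_ham
        elif p_ham / p_spam >= ratio:
            model[word] = -(p_ham / p_spam)
    return model


def classify(model, text):
    """Return True (spam) when the summed word weights are positive."""
    words = set(text.lower().split())
    return sum(model.get(w, 0.0) for w in words) > 0


def personalize(train_spam, train_ham, inbox, passes=3):
    """Start from the training-set model, then rebuild it from the
    inbox's own predicted labels for a fixed number of passes."""
    model = build_model(train_spam, train_ham)
    for _ in range(passes):
        pred_spam = [m for m in inbox if classify(model, m)]
        pred_ham = [m for m in inbox if not classify(model, m)]
        if not pred_spam or not pred_ham:
            break  # one class is empty; keep the previous model
        model = build_model(pred_spam, pred_ham)
    return model
```

In this sketch no user feedback is needed: the inbox is labeled by the current model, and the next model is estimated from those predicted labels, so words that appear only in the user's own mail can become significant.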