Author_Institution :
Dept. of Electr. & Comput. Eng., Clemson Univ., Clemson, SC, USA
Abstract :
The explosive growth of unsolicited emails has prompted the development of numerous spam filtering techniques. Bayesian spam filters are superior to static keyword-based spam filters in that they can continuously evolve to tackle new spam by learning keywords from new spam emails. However, Bayesian spam filters are easily poisoned by clever spammers who avoid spam keywords and add many innocuous words to their emails. Bayesian spam filters also need a significant amount of time to adapt to new spam based on user feedback. Moreover, few current spam filters exploit social networks to assist in spam detection. To develop an accurate and user-friendly spam filter, we propose a SOcial network Aided Personalized and effective spam filter (SOAP) in this paper. In SOAP, each node connects to its social friends; i.e., nodes form a distributed overlay by directly using social network links as overlay links. Each node uses SOAP to collect information and check for spam autonomously in a distributed manner. Unlike previous spam filters that focus on parsing keywords (e.g., Bayesian filters) or building blacklists, SOAP exploits the social relationships among email correspondents and their (dis)interests to detect spam adaptively and automatically. In each node, SOAP integrates four components into the basic Bayesian filter: social closeness-based spam filtering, social interest-based spam filtering, adaptive trust management, and friend notification. We have evaluated the performance of SOAP using simulation based on trace data from Facebook. We have also implemented a SOAP prototype for real-world experiments. Experimental results show that SOAP can greatly improve the performance of Bayesian spam filters in terms of accuracy, attack-resilience, and efficiency of spam detection. The performance of the Bayesian spam filter serves as a lower bound on SOAP's performance.
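The poisoning weakness described in the abstract can be illustrated with a toy naive Bayes filter. The following is a minimal sketch and not SOAP or the authors' filter: the function names, the six-message training set, and the use of add-one smoothing are all assumptions made for illustration only.

```python
import math
from collections import Counter

# Hypothetical toy training set: (message text, is_spam) pairs.
TRAINING = [
    ("cheap pills buy now", True),
    ("win money now", True),
    ("buy cheap watches", True),
    ("meeting agenda for monday", False),
    ("lunch on monday with the team", False),
    ("project report and agenda attached", False),
]


def train(messages):
    """Count word occurrences and message totals per class."""
    counts = {True: Counter(), False: Counter()}
    totals = {True: 0, False: 0}
    for text, is_spam in messages:
        counts[is_spam].update(text.lower().split())
        totals[is_spam] += 1
    return counts, totals


def spam_log_odds(text, counts, totals):
    """Naive Bayes log-odds of spam versus non-spam, with add-one smoothing.

    A positive value means the message looks more like spam.
    """
    vocab = set(counts[True]) | set(counts[False])
    score = math.log(totals[True] / totals[False])  # class prior
    for cls, sign in ((True, 1), (False, -1)):
        denom = sum(counts[cls].values()) + len(vocab)
        for word in text.lower().split():
            score += sign * math.log((counts[cls][word] + 1) / denom)
    return score


counts, totals = train(TRAINING)

# A plain spam message, and the same message padded with innocuous words.
PLAIN = "buy cheap pills"
POISONED = PLAIN + " meeting agenda monday project report team lunch"

print("plain   :", spam_log_odds(PLAIN, counts, totals))
print("poisoned:", spam_log_odds(POISONED, counts, totals))
```

On this toy data the plain message scores above zero and the padded one scores below zero, so a filter that thresholds at zero would let the padded message through. A keyword-only filter has no signal other than the words themselves, which is the gap the social closeness and interest components named in the abstract are meant to address.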
Keywords :
Bayes methods; information filtering; performance evaluation; social networking (online); trusted computing; unsolicited e-mail; Bayesian filter; Bayesian spam filters; Facebook; SOAP prototype; adaptive trust management; attack-resilience; clever spammers; distributed overlay; email correspondents; friend notification; overlay links; parsing keywords; social closeness-based spam filtering; social friends; social interest-based spam filtering; social network aided personalization; social network links; social networks; spam detection; spam emails; spam filter techniques; spam keywords; static keyword-based spam filters; trace data; unsolicited e-mails; user feedback; user-friendly spam filter; Simple object access protocol; Social network services; Unsolicited electronic mail; Distributed overlays; spam filtering