Title :
Considering behavior of sender in spam mail detection
Author :
Naksomboon, S. ; Charnsripinyo, C. ; Wattanapongsakorn, N.
Author_Institution :
Comput. Eng. Dept., King Mongkut's Univ. of Technol. Thonburi, Bangkok, Thailand
Abstract :
Recently, the number of spam mails has been growing exponentially. Spam raises costs for organizations and annoys e-mail recipients. Spammers continually look for ways to avoid being filtered out by the e-mail system. At the same time, e-mail recipients and network/system administrators want an effective filtering technique to catch spam mails. One problem in spam mail filtering is that each user has a different perspective on what counts as spam, so there are many types of spam mails; the challenge is how to detect their various types and forms. In this paper, the behaviors of spammers are used to customize the filtering rules. Information from the spam messages can also be used to filter spam mails, and it can give higher accuracy than the keyword-based method. We propose a spam classification approach using the Random Forest algorithm. The Spam Assassin Corpus is selected as the dataset for classification. It consists of 6,047 e-mail messages, of which 4,150 are legitimate messages and the other 1,897 are spam mails.
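As a rough illustration of what "behavior of the sender" could mean in practice, the sketch below derives a few sender-side features from message headers using only the Python standard library. The abstract does not list the features the authors actually used, so the function name `sender_features`, the four features, and the sample message are all hypothetical assumptions chosen for illustration.

```python
from email import message_from_string
from email.utils import parseaddr


def sender_features(raw_message: str) -> dict:
    """Return hypothetical sender-behavior features (NOT the paper's feature set)."""
    msg = message_from_string(raw_message)
    # parseaddr returns (display_name, address); only the address is used here.
    _, from_addr = parseaddr(msg.get("From", ""))
    _, reply_addr = parseaddr(msg.get("Reply-To", ""))
    received = msg.get_all("Received") or []
    local_part = from_addr.split("@")[0]
    return {
        # Number of relay hops recorded in Received headers.
        "num_received_hops": len(received),
        # 1 if a Reply-To address exists and differs from the From address.
        "reply_to_differs": int(bool(reply_addr) and reply_addr.lower() != from_addr.lower()),
        # 1 if the message carries no Message-ID header.
        "missing_message_id": int(msg.get("Message-ID") is None),
        # 1 if the local part of the From address contains a digit.
        "from_has_digits": int(any(ch.isdigit() for ch in local_part)),
    }


# Made-up sample message, used only to exercise the function.
SAMPLE = (
    "Received: from a.example by b.example\n"
    "Received: from c.example by a.example\n"
    "From: Promo <win123@offers.example>\n"
    "Reply-To: claims@other.example\n"
    "Subject: You won\n"
    "\n"
    "Click here.\n"
)

features = sender_features(SAMPLE)
print(features)
```

Feature dictionaries of this kind could then be turned into numeric vectors and passed to a Random Forest implementation (for example scikit-learn's `RandomForestClassifier`), with labels taken from the 4,150 legitimate and 1,897 spam messages of the corpus; that training step is left out here to keep the sketch dependency-free.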
Keywords :
Artificial neural networks; Bayesian methods; Business; Computer networks; Electronic mail; Filtering; Filters; Internet; Postal services; Unsolicited electronic mail; Data classification; Random Forest; Spam Assassin dataset; Spam mail detection;
Conference_Title :
Networked Computing (INC), 2010 6th International Conference on
Conference_Location :
Gyeongju, Korea (South)
Print_ISBN :
978-1-4244-6986-4
Electronic_ISBN :
978-89-88678-20-6