DocumentCode
459444
Title
Traffic classification-based spam filter
Author
Zhang, Ni ; Jiang, Yu ; Fang, Binxing ; Cheng, Xueqi ; Guo, Li
Author_Institution
Software Division, Institute of Computing Technology, Chinese Academy of Sciences, 100080, Beijing, China; Graduate School of Chinese Academy of Sciences, 100039, Beijing, China
Volume
5
fYear
2006
fDate
June 2006
Firstpage
2130
Lastpage
2135
Abstract
We propose an unsupervised spam filter called Bulk Mail Traffic Classification (BMTC) for filtering junk mail from the perspective of ISPs. Our insight is that spammers generally send mass unsolicited emails with few alterations to a common message content, a pattern that can be observed in a large-scale traffic environment. In our approach, we classify email delivery traffic into categories by the similarity of message contents. We then decide whether a particular email category is spam based on the number of similar mails in that category, and take measures to filter it. We also design a simulator, two sketch data structures, and a series of algorithms to support our method. We have applied BMTC to email traffic data captured at one of the largest commercial Internet service providers in China, and the experimental results indicate that a 70.4% reduction in emails can be achieved with our method. The results also show that BMTC is practical: it can be implemented in a high-volume traffic environment handling millions of mails every day with small memory consumption.
Keywords
Data structures; Delay; Information filtering; Information filters; Postal services; Protection; Telecommunication traffic; Traffic control; Unsolicited electronic mail; Web and internet services;
fLanguage
English
Publisher
IEEE
Conference_Titel
Communications, 2006. ICC '06. IEEE International Conference on
Conference_Location
Istanbul
ISSN
8164-9547
Print_ISBN
1-4244-0355-3
Electronic_ISBN
8164-9547
Type
conf
DOI
10.1109/ICC.2006.255085
Filename
4024480