DocumentCode :
1852206
Title :
Training anti-spam models with smaller training set via SVM way
Author :
Diao, LiLi ; Yang, Chengzhong
Author_Institution :
Core-Technol., Trend Micro Inc., Nanjing, China
Volume :
2
fYear :
2010
fDate :
1-3 Aug. 2010
Abstract :
In the Internet era, email has become one of the most popular means of communication, but spam remains a serious nuisance. As a result, email filtering has become an active research topic that has attracted considerable effort. In real-world applications, however, training email datasets are large-scale, unlike the datasets assumed in experiments, and this challenges both efficiency and effectiveness. A new method for filtering email is therefore needed. In this paper, we propose an SVM-based machine learning method that compresses the training set with minimal information loss. The key step is to reduce the large-scale training email set according to the distribution of the support vectors produced by SVM training. The resulting compressed training set saves time while preserving precision when generating anti-spam models. Experiments show that the trained anti-spam classifier achieves better performance when our compression approach is applied.
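The abstract does not state the exact rule used to shrink the training set, so the following is only a minimal sketch of the general idea, under several assumptions: a linear soft-margin SVM trained by plain subgradient descent, two-dimensional toy features standing in for email features, and a hypothetical `keep_ratio` parameter that keeps the samples with the smallest functional margin (those nearest the decision boundary, where support vectors lie). None of these names or choices come from the paper.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(X, y, lam=0.01, eta=0.01, epochs=50, seed=0):
    """Subgradient descent on the L2-regularised hinge loss; labels are -1 or +1."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * (dot(w, X[i]) + b) < 1.0
            w = [(1.0 - eta * lam) * wj for wj in w]  # L2 shrinkage of the weights
            if violated:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def compress(X, y, w, b, keep_ratio=0.25):
    """Assumed rule: keep the indices of the samples closest to the boundary."""
    margins = [y[i] * (dot(w, X[i]) + b) for i in range(len(X))]
    n_keep = max(1, int(keep_ratio * len(X)))
    return sorted(range(len(X)), key=lambda i: margins[i])[:n_keep]

def accuracy(X, y, w, b):
    hits = sum(1 for xi, yi in zip(X, y) if yi * (dot(w, xi) + b) > 0)
    return hits / len(X)

def make_toy_data(n_per_class=100, seed=1):
    """Two separable clusters; +1 plays the role of spam, -1 of legitimate mail."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (-1, 1):
        for _ in range(n_per_class):
            X.append([3.0 * label + rng.uniform(-1, 1),
                      3.0 * label + rng.uniform(-1, 1)])
            y.append(label)
    return X, y

X, y = make_toy_data()
w_full, b_full = train_linear_svm(X, y)        # first pass on the full set
keep = compress(X, y, w_full, b_full)          # select boundary samples
X_small = [X[i] for i in keep]
y_small = [y[i] for i in keep]
w_small, b_small = train_linear_svm(X_small, y_small)  # retrain on the subset
print(len(X), len(X_small), accuracy(X, y, w_small, b_small))
```

The model retrained on the compressed subset is evaluated on the full toy set, which mirrors the abstract's claim that precision can be kept while training time is reduced. On real email data the feature extraction, kernel, and selection rule would all need to follow the paper itself.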
Keywords :
Internet; information filtering; support vector machines; unsolicited e-mail; Internet era; SVM based machine learning method; anti-spam models; email filtering; smaller training set; Electronic mail; Filtering; Learning systems; Machine learning; Redundancy; Support vector machines; Training; SVM (support vector machine); email filter; machine learning; training set shrink
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Electronics and Information Engineering (ICEIE), 2010 International Conference On
Conference_Location :
Kyoto
Print_ISBN :
978-1-4244-7679-4
Electronic_ISBN :
978-1-4244-7681-7
Type :
conf
DOI :
10.1109/ICEIE.2010.5559725
Filename :
5559725