Title :
An improved selective ensemble method for spam filtering
Author :
Jinye Cai ; Pingping Xu ; Huiyu Tang ; Lin Sun
Author_Institution :
Nat. Mobile Commun. Res. Lab., Southeast Univ., Nanjing, China
Abstract :
This paper presents an improved selective ensemble method for filtering spam messages. The design solves the selection problem by clustering sub-classifiers according to the diversity between them. To improve accuracy and stability, a confidence weight is proposed to evaluate the reliability of the selected sub-classifiers. The training model is built from small datasets, as in real-world use. For practical usage, the method uses only 150 samples from the user's file and executes bootstrapping between 50 and 70 times on them. Experiments validate the effectiveness of this method in handling the spam filtering problem.
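The pipeline outlined in the abstract (bootstrap a small sample set, train sub-classifiers, cluster them by diversity, weight the selected ones) can be illustrated with a minimal sketch. The abstract does not specify the diversity measure, the clustering algorithm, or the formula for the confidence weight, so everything below is an assumption made for illustration: a one-feature threshold classifier stands in for the SVM sub-classifiers, diversity is taken as the pairwise disagreement rate, clustering is a greedy leader scheme with a hypothetical `radius` parameter, and the confidence weight is approximated by accuracy on the 150 training samples. The data is synthetic, and all function names are hypothetical.

```python
import random

def make_toy_data(n, rng):
    """Synthetic stand-in for e-mail feature vectors: label 1 = spam, 0 = ham."""
    data = []
    for i in range(n):
        y = i % 2
        mu = 1.0 if y == 1 else -1.0
        data.append(([rng.gauss(mu, 1.0) for _ in range(3)], y))
    return data

def train_stump(sample, rng):
    """Toy sub-classifier (assumed stand-in for an SVM): threshold one
    randomly chosen feature midway between the two class means."""
    f = rng.randrange(len(sample[0][0]))
    spam = [x[f] for x, y in sample if y == 1] or [0.0]
    ham = [x[f] for x, y in sample if y == 0] or [0.0]
    m1, m0 = sum(spam) / len(spam), sum(ham) / len(ham)
    return (f, (m1 + m0) / 2, 1 if m1 >= m0 else -1)

def predict(clf, x):
    f, thr, sign = clf
    return 1 if sign * (x[f] - thr) > 0 else 0

def accuracy(clf, data):
    return sum(predict(clf, x) == y for x, y in data) / len(data)

def disagreement(a, b, data):
    """Assumed diversity measure: fraction of samples where a and b differ."""
    return sum(predict(a, x) != predict(b, x) for x, _ in data) / len(data)

def select_by_clustering(clfs, data, radius=0.1):
    """Greedy leader clustering on disagreement (an assumption, not the
    paper's algorithm); keep the most accurate member of each cluster."""
    clusters = []
    for c in clfs:
        for group in clusters:
            if disagreement(c, group[0], data) < radius:
                group.append(c)
                break
        else:
            clusters.append([c])
    return [max(g, key=lambda c: accuracy(c, data)) for g in clusters]

def ensemble_predict(selected, weights, x):
    """Weighted vote; the weights play the role of the confidence weight."""
    spam_score = sum(w for c, w in zip(selected, weights) if predict(c, x) == 1)
    return 1 if spam_score > sum(weights) / 2 else 0

rng = random.Random(0)
train = make_toy_data(150, rng)   # 150 samples, as in the abstract
rounds = 60                       # within the 50 to 70 bootstrap rounds cited
pool = [train_stump([rng.choice(train) for _ in train], rng) for _ in range(rounds)]
selected = select_by_clustering(pool, train)
weights = [accuracy(c, train) for c in selected]  # assumed confidence weight
test = make_toy_data(200, rng)
ensemble_acc = sum(ensemble_predict(selected, weights, x) == y for x, y in test) / len(test)
print(len(selected), round(ensemble_acc, 3))
```

Selecting one representative per cluster keeps the ensemble small while preserving classifiers that make different errors, which is the general motivation for diversity-based selection; the specific choices here should not be read as the authors' design.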
Keywords :
information filtering; pattern classification; pattern clustering; security of data; software reliability; unsolicited e-mail; bootstrapping; confidence weight; improved selective ensemble method; selected sub-classifier reliability evaluation; spam filtering problem; training model; Accuracy; Bagging; Databases; Filtering; Support vector machines; Training; Unsolicited electronic mail; Classification; Clustering; SVM; Selective ensemble; Spam filtering; Text mining;
Conference_Title :
Communication Technology (ICCT), 2013 15th IEEE International Conference on
Conference_Location :
Guilin
DOI :
10.1109/ICCT.2013.6820473