DocumentCode :
3322394
Title :
A Cluster-based Approach to Filtering Spam under Skewed Class Distributions
Author :
Hsiao, Wen-Feng ; Chang, Te-Ming ; Hu, Guo-Hsin
Author_Institution :
Dept. of Inf. Manage., Nat. Pingtung Inst. of Commerce
fYear :
2007
fDate :
Jan. 2007
Firstpage :
53
Lastpage :
53
Abstract :
This research proposes a classification approach that improves the effectiveness of spam filtering under skewed class distributions. The proposed clustering-based classifier first clusters documents into several groups and then extracts an equal number of keywords from each group to alleviate the problem caused by skewed class distributions. Experiments are conducted to validate the effectiveness of the proposed classifier. The results show that the proposed classifier can effectively deal with skewed class distributions in the task of spam filtering.
Keywords :
data mining; pattern classification; pattern clustering; text analysis; unsolicited e-mail; classification; document clustering; keyword extraction; skewed class distribution; spam filtering; text mining; Boosting; Decision trees; Frequency; Information filtering; Information filters; Information management; Matched filters; Support vector machines; Text mining; Unsolicited electronic mail;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
System Sciences, 2007. HICSS 2007. 40th Annual Hawaii International Conference on
Conference_Location :
Waikoloa, HI
ISSN :
1530-1605
Electronic_ISBN :
1530-1605
Type :
conf
DOI :
10.1109/HICSS.2007.7
Filename :
4076478