Title :
Interactive Spam Filtering with Active Learning and Feature Selection
Author :
Okabe, Masayuki ; Yamada, Seiji
Author_Institution :
Toyohashi Univ. of Technol., Toyohashi
Abstract :
This paper proposes an interactive spam filtering method that combines active learning and feature selection. Identifying effective features is very important in spam filtering because spam mails include many meaningless words that differ only slightly from each other. Distinguishing effective features from ineffective ones is therefore a promising approach. Traditional feature selection methods assume that some amount of labeled training data is available, but this assumption does not hold in interactive spam filtering. We propose a method that identifies effective features through active learning in spam filtering with a naive Bayes approach. Experimental results show that our method outperforms traditional methods that operate with no feature selection.
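The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a multinomial naive Bayes model with Laplace smoothing, uncertainty sampling as the active learning query strategy (ask the user about the message whose spam probability is closest to 0.5), and an absolute log-odds score for ranking features. All function names (`train`, `spam_probability`, `most_uncertain`, `select_features`) and the toy messages are hypothetical.

```python
import math
from collections import Counter

LABELS = ("spam", "ham")

def train(labeled):
    """Collect naive Bayes counts from (tokens, label) pairs."""
    word_counts = {label: Counter() for label in LABELS}
    doc_counts = Counter()
    for tokens, label in labeled:
        word_counts[label].update(tokens)
        doc_counts[label] += 1
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab

def spam_probability(model, tokens, features=None):
    """Posterior P(spam | tokens); tokens outside `features` are ignored."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    score = {}
    for label in LABELS:
        # Laplace-smoothed class prior and word likelihoods (assumption).
        s = math.log((doc_counts[label] + 1) / (total_docs + 2))
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok not in vocab:
                continue
            if features is not None and tok not in features:
                continue
            s += math.log((word_counts[label][tok] + 1) / denom)
        score[label] = s
    # Numerically stable sigmoid of the log-odds.
    diff = score["spam"] - score["ham"]
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)

def most_uncertain(model, unlabeled, features=None):
    """Uncertainty sampling: index of the message closest to P(spam) = 0.5."""
    return min(
        range(len(unlabeled)),
        key=lambda i: abs(spam_probability(model, unlabeled[i], features) - 0.5),
    )

def select_features(model, k):
    """Keep the k words with the largest absolute smoothed log-odds (assumed criterion)."""
    word_counts, _, vocab = model
    denom = {l: sum(word_counts[l].values()) + len(vocab) for l in LABELS}
    def log_odds(tok):
        p_spam = (word_counts["spam"][tok] + 1) / denom["spam"]
        p_ham = (word_counts["ham"][tok] + 1) / denom["ham"]
        return abs(math.log(p_spam / p_ham))
    return set(sorted(vocab, key=log_odds, reverse=True)[:k])

# Toy usage: one labeled message per class, three unlabeled candidates.
labeled = [(["cheap", "pills", "buy"], "spam"),
           (["meeting", "tomorrow", "agenda"], "ham")]
unlabeled = [["cheap", "pills", "buy"], ["meeting", "agenda"], ["cheap", "meeting"]]
model = train(labeled)
query_index = most_uncertain(model, unlabeled)  # message to show the user next
features = select_features(model, 4)
```

In an interactive loop, the user's answer for `unlabeled[query_index]` would be appended to `labeled`, the model retrained, and the feature set recomputed before the next query.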
Keywords :
Bayes methods; feature extraction; information filtering; learning (artificial intelligence); probability; unsolicited e-mail; active learning; feature selection; interactive spam filtering method; naive Bayes approach; spam mail; Active filters; Electronic mail; Information filtering; Information filters; Intelligent agent; Postal services; Sampling methods; Training data; Uncertainty; Unsolicited electronic mail; spam filtering
Conference_Title :
Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-0-7695-3496-1
DOI :
10.1109/WIIAT.2008.336