DocumentCode
566751
Title
A hierarchical framework for content-based image spam filtering
Author
Li, Xiao Mang ; Kim, Ung Mo
Author_Institution
Sch. of Inf. & Commun. Eng., Sungkyunkwan Univ., Suwon, South Korea
Volume
1
fYear
2012
fDate
26-28 June 2012
Firstpage
149
Lastpage
155
Abstract
Since the 1990s, spam has become a serious threat to email communication, and the competition between spammers and anti-spam filters has continued to this day. To filter spam through semantic analysis of email content, many content-based anti-spam approaches have been put forward, such as text-based filtering and image-based filtering. However, the tricks played by spammers have also evolved quickly. It now turns out that any single anti-spam approach is too limited to handle diverse real-world spam effectively. How to combine current techniques to construct more effective anti-spam systems has therefore become the major focus of our research. In this paper, we propose a novel hierarchical anti-spam framework, which adopts multiple techniques, including text classification, image processing, and Optical Character Recognition, in different layers to detect spam. We evaluate the proposed approach on several public spam corpora as well as our personal corpus, and verify its effectiveness in terms of filtering capacity and filtering performance.
Keywords
classification; content-based retrieval; information filtering; optical character recognition; text analysis; unsolicited e-mail; anti-spam filter; anti-spam system; content-based anti-spam approach; content-based image spam filtering; email communication; email content; filtering capacity; filtering performance; hierarchical framework; image processing; image-based filtering; personal corpus; public spam corpora; real-world spam; semantic analysis; spam detection; spammer; text classification; text-based filtering; Accuracy; Training; framework; spam filtering
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Science and Digital Content Technology (ICIDT), 2012 8th International Conference on
Conference_Location
Jeju
Print_ISBN
978-1-4673-1288-2
Type
conf
Filename
6269246