Title :
Image analysis for efficient categorization of image-based spam e-mail
Author :
Aradhye, Hrishikesh B. ; Myers, Gregory K. ; Herson, James A.
Author_Institution :
SRI Int., Menlo Park, CA, USA
Date :
29 Aug.-1 Sept. 2005
Abstract :
To circumvent prevalent text-based anti-spam filters, spammers have begun embedding the advertisement text in images. Analogously, proprietary information (such as source code) may be communicated as screenshots to defeat text-based monitoring of outbound e-mail. The proposed method separates spam images from other common categories of e-mail images based on extracted overlay text and color features. No expensive OCR processing is necessary. Our method works robustly in spite of complex backgrounds, compression artifacts, and a wide variety of formats and fonts of overlaid spam text. We also demonstrate that it successfully detects screenshots in outbound e-mail.
Keywords :
feature extraction; image classification; image colour analysis; text analysis; unsolicited e-mail; color feature extraction; image analysis; image-based spam e-mail categorization; outbound e-mail; text feature extraction; text-based anti-spam filter; text-based monitoring; Data mining; Electronic mail; Filters; Image analysis; Image color analysis; Monitoring; Optical character recognition software; Robustness; Text recognition; Unsolicited electronic mail
Conference_Title :
Document Analysis and Recognition, 2005. Proceedings. Eighth International Conference on
Print_ISBN :
0-7695-2420-6
DOI :
10.1109/ICDAR.2005.135