Title :
Applying Tesseract-OCR to detection of image spam mails
Author :
Yamakawa, Daisuke ; Yoshiura, Noriaki
Author_Institution :
Dept. of Inf. & Comput. Sci., Saitama Univ., Saitama, Japan
Abstract :
This paper applies Tesseract-OCR, an optical character recognition software package, to image spam mail filtering. Tesseract-OCR can be specialized to a particular language, and this paper specializes it to spam words. This specialization reduces the time and CPU power needed to check whether images in mails contain spam words. The paper evaluates the performance of the Tesseract-OCR-based spam mail filter experimentally.
Keywords :
object detection; optical character recognition; unsolicited e-mail; Tesseract-OCR; image spam mails detection; optical character recognition software; spam mail filter; spam words; Character recognition; Drugs; Image recognition; Optical character recognition software; Postal services; Training; Unsolicited electronic mail; Computer Network Security; Image processing; Spam mail
Conference_Title :
Network Operations and Management Symposium (APNOMS), 2012 14th Asia-Pacific
Conference_Location :
Seoul
Print_ISBN :
978-1-4673-4494-4
Electronic_ISBN :
978-1-4673-4495-1
DOI :
10.1109/APNOMS.2012.6356068