DocumentCode
2264177
Title
Applying Tesseract-OCR to detection of image spam mails
Author
Yamakawa, Daisuke ; Yoshiura, Noriaki
Author_Institution
Dept. of Inf. & Comput. Sci., Saitama Univ., Saitama, Japan
fYear
2012
fDate
25-27 Sept. 2012
Firstpage
1
Lastpage
4
Abstract
This paper applies Tesseract-OCR, an optical character recognition program, to image spam mail filtering. Tesseract-OCR can be trained for a specific language, and this paper trains it specifically on spam words. This specialization reduces the time and CPU resources needed to check whether images in mails contain spam words. The paper evaluates the resulting Tesseract-OCR-based spam mail filter by experiment.
Keywords
object detection; optical character recognition; unsolicited e-mail; Tesseract-OCR; image spam mails detection; optical character recognition software; spam mail filter; spam words; Character recognition; Drugs; Image recognition; Optical character recognition software; Postal services; Training; Unsolicited electronic mail; Computer Network Security; Image processing; Spam mail
fLanguage
English
Publisher
IEEE
Conference_Titel
Network Operations and Management Symposium (APNOMS), 2012 14th Asia-Pacific
Conference_Location
Seoul
Print_ISBN
978-1-4673-4494-4
Electronic_ISBN
978-1-4673-4495-1
Type
conf
DOI
10.1109/APNOMS.2012.6356068
Filename
6356068