DocumentCode :
3194304
Title :
Text localization in spam image using edge features
Author :
Wan, Mingcheng ; Zhang, Fengli ; Cheng, Hongrong ; Liu, Qiao
Author_Institution :
Sch. of Comput. Sci. & Eng., Univ. of Electron. Sci. & Technol. of China, Chengdu
fYear :
2008
fDate :
25-27 May 2008
Firstpage :
838
Lastpage :
842
Abstract :
Nowadays, more and more spam emails convey their messages in human-readable images instead of text, making detection by conventional content filters difficult. However, the text contained in spam images can be very useful for spam detection. This paper proposes an effective algorithm for text localization in spam images; the basic idea is to discriminate non-text edges using selected edge features. Furthermore, we construct a corner detection algorithm based on a circular template to predict the corner points of the text in an image, which is crucial for text localization. Our evaluation shows that the algorithm identifies 96% of the text contained in spam images, with precision reaching up to 97.6% on real-world data (spam image samples come from the SpamArchive public dataset).
Keywords :
e-mail filters; edge detection; feature extraction; text analysis; unsolicited e-mail; corner detection; edge features; spam detection; spam emails; spam images; spam messages; text localization; Automatic testing; Computer science; Data mining; Detection algorithms; Filters; Humans; Image analysis; Image edge detection; Image storage; Unsolicited electronic mail;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Communications, Circuits and Systems, 2008. ICCCAS 2008. International Conference on
Conference_Location :
Fujian
Print_ISBN :
978-1-4244-2063-6
Electronic_ISBN :
978-1-4244-2064-3
Type :
conf
DOI :
10.1109/ICCCAS.2008.4657900
Filename :
4657900