DocumentCode :
1797361
Title :
Deobfuscation based on edit distance algorithm for spam filtering
Author :
Xinwang Zhong
Author_Institution :
Dept. of Comput. Sci. & Eng., South China Univ. of Technol., Guangzhou, China
Volume :
1
fYear :
2014
fDate :
13-16 July 2014
Firstpage :
109
Lastpage :
114
Abstract :
Spam has grown rapidly on the Internet. An adversary obfuscates a spam message by misspelling words or inserting useless characters to mislead the spam filter. Humans can still understand the original meaning of the camouflaged words, but the spam filter cannot recognize them. This paper focuses on the well-known obfuscation problem that uses non-alphabetical characters, e.g. "Viagra" is modified to "V!@gr@". The string edit distance algorithm is revised to handle non-alphabetical characters. In the experiment, the proposed deobfuscation method outperforms the traditional string edit distance algorithm.
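The abstract does not spell out how the edit distance is revised, so the following is only an illustrative sketch and not the paper's algorithm. It assumes one plausible revision: non-alphabetical characters in the obfuscated token may be substituted or deleted at a reduced cost. The function name `edit_distance` and the parameter `symbol_cost` are hypothetical.

```python
def edit_distance(token, word, symbol_cost=1.0):
    """Levenshtein distance from token to word (case-insensitive).

    symbol_cost is the assumed price of substituting or deleting a
    non-alphabetical character of token; 1.0 gives the standard distance.
    """
    token, word = token.lower(), word.lower()
    # prev[j] = cost of turning the empty prefix of token into word[:j]
    prev = [float(j) for j in range(len(word) + 1)]
    for t in token:
        # Letters always cost 1.0; symbols cost symbol_cost (assumption).
        t_cost = 1.0 if t.isalpha() else symbol_cost
        curr = [prev[0] + t_cost]  # delete every token character so far
        for j, w in enumerate(word, start=1):
            sub = 0.0 if t == w else t_cost
            curr.append(min(
                prev[j] + t_cost,    # delete t from token
                curr[j - 1] + 1.0,   # insert w into token
                prev[j - 1] + sub,   # match or substitute
            ))
        prev = curr
    return prev[-1]


# Standard distance versus the symbol-tolerant variant.
print(edit_distance("V!@gr@", "viagra"))                   # 3.0
print(edit_distance("V!@gr@", "viagra", symbol_cost=0.0))  # 0.0
```

With the standard costs, "V!@gr@" is three substitutions away from "viagra"; when symbols are treated as free wildcards, the distance drops to zero, which is the kind of gap a symbol-aware distance is meant to close.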
Keywords :
formal languages; information filtering; support vector machines; unsolicited e-mail; SVM; backtrack algorithm; deobfuscation method; nonalphabetical character handling; obfuscation problem; spam filtering; spam message; spamming problem; string edit distance algorithm; support vector machine; Abstracts; Barium; Indexes; Support vector machines; Unsolicited electronic mail; Backtrack algorithm; SVM; Spam Filter; String Edit Distance algorithm;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Cybernetics (ICMLC), 2014 International Conference on
Conference_Location :
Lanzhou
ISSN :
2160-133X
Print_ISBN :
978-1-4799-4216-9
Type :
conf
DOI :
10.1109/ICMLC.2014.7009101
Filename :
7009101