Regional Information Center for Science and Technology - Combating Good Word Attacks on Statistical Spam Filters with Multiple Instance Learning

DocumentCode :

2485575

Title :

Combating Good Word Attacks on Statistical Spam Filters with Multiple Instance Learning

Author :

Zhou, Yan ; Jorgensen, Zach ; Inge, Meador

Author_Institution :

Univ. of South Alabama, Mobile

Volume :

fYear :

2007

fDate :

29-31 Oct. 2007

Firstpage :

298

Lastpage :

305

Abstract :

Statistical spam filters are known to be vulnerable to adversarial attacks. One such adversarial attack, known as the good word attack, thwarts spam filters by appending to spam messages sets of "good" words, which are common in legitimate e-mail but rare in spam. We present a counter-attack strategy that first attempts to differentiate spam from legitimate e-mail in the input space, by transforming each e-mail into a bag of multiple segments, and subsequently applies multiple instance logistic regression on the bags. We treat each segment in the bag as an instance. An e-mail is classified as spam if at least one instance in the corresponding bag is spam, and as legitimate if all the instances in it are legitimate. We show that a spam filter using our multiple instance counter-attack strategy stands up better to good word attacks than its single instance counterpart and the commonly practiced Bayesian filters.
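
The bag-level decision rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how segments are formed or how the instance model is trained, so the contiguous equal-size split, the hand-set per-token `weights`, the 0.5 `threshold`, and all function names below are assumptions made for illustration only.

```python
import math
from typing import Dict, List


def split_into_segments(tokens: List[str], n_segments: int) -> List[List[str]]:
    """Split a token list into contiguous chunks of near-equal size.

    Assumption: the abstract does not specify the splitting scheme; a plain
    contiguous split is used here only to produce a bag of instances.
    """
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    size = max(1, math.ceil(len(tokens) / n_segments))
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def instance_spam_probability(segment: List[str],
                              weights: Dict[str, float],
                              bias: float = 0.0) -> float:
    """Logistic score of one segment under hypothetical per-token weights."""
    z = bias + sum(weights.get(tok, 0.0) for tok in segment)
    return 1.0 / (1.0 + math.exp(-z))


def bag_is_spam(bag: List[List[str]],
                weights: Dict[str, float],
                threshold: float = 0.5) -> bool:
    """Spam if at least one instance is spam; legitimate if all are legitimate."""
    return any(instance_spam_probability(seg, weights) > threshold for seg in bag)


# Hypothetical weights: positive values lean spam, negative lean legitimate.
weights = {"cheap": 2.0, "viagra": 3.0,
           "meeting": -3.0, "agenda": -3.0, "report": -3.0}

# A spam message with "good" words appended after the spam content.
tokens = ["cheap", "viagra", "meeting", "agenda", "report", "meeting"]

single_instance = bag_is_spam([tokens], weights)
multi_instance = bag_is_spam(split_into_segments(tokens, 2), weights)
print(single_instance, multi_instance)
```

With these illustrative weights, the appended good words pull the whole-message score below the threshold, while the segment that still holds the spam content scores above it, so only the segmented bag is flagged. Real weights would come from training, which the sketch leaves out.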

Keywords :

information filters; security of data; statistical analysis; unsolicited e-mail; adversarial attacks; combating good word attacks; counter-attack strategy; legitimate e-mail; multiple instance learning; multiple instance logistic regression; statistical spam filters; thwarts spam filters; Artificial intelligence; Drugs; Electronic mail; Information filtering; Information filters; Learning; Logistics; Mobile computing; USA Councils; Unsolicited electronic mail;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Tools with Artificial Intelligence, 2007. ICTAI 2007. 19th IEEE International Conference on

Conference_Location :

Patras

ISSN :

1082-3409

Print_ISBN :

978-0-7695-3015-4

Type :

conf

DOI :

10.1109/ICTAI.2007.120

Filename :

4410395

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2485575