Title :
Feature selection for Spam and Phishing detection
Author :
Toolan, Fergus ; Carthy, Joe
Author_Institution :
UCD Centre for Cybercrime Investig., Univ. Coll. Dublin, Dublin, Ireland
Abstract :
Unsolicited Bulk Email (UBE) has become a large problem in recent years. The number of mass mailers in existence is increasing dramatically. Automatically detecting UBE has become a vital area of current research. Many email clients (such as Outlook and Thunderbird) already have junk filters built in. Mass mailers are continually evolving and overcoming some of the junk filters. This means that the need for research in the area is ongoing. Many existing techniques seem to choose the features used for classification arbitrarily. This paper aims to address this issue by investigating the utility of over 40 features that have been used in recent literature. Information gain for these features is calculated over Ham, Spam and Phishing corpora.
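
The abstract does not spell out how information gain is computed. As a rough illustration only, the sketch below uses the standard definition, IG = H(class) - H(class | feature), over three classes. The feature names (`contains_html`, `has_ip_url`, `uninformative`) and the six toy emails are invented for this example and are not taken from the paper's feature set or corpora.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(feature_values, labels):
    """IG = H(labels) - sum over v of P(feature = v) * H(labels | feature = v)."""
    total = len(labels)
    # Group the class labels by the value the feature takes on each email.
    by_value = {}
    for value, label in zip(feature_values, labels):
        by_value.setdefault(value, []).append(label)
    conditional = sum(
        (len(subset) / total) * entropy(subset) for subset in by_value.values()
    )
    return entropy(labels) - conditional


# Toy data, invented for illustration (not from the paper's corpora).
labels = ["ham", "ham", "spam", "spam", "phishing", "phishing"]
features = {
    "contains_html": [0, 0, 1, 1, 1, 1],  # hypothetical: separates ham from the rest
    "has_ip_url": [0, 0, 0, 0, 1, 1],     # hypothetical: separates phishing from the rest
    "uninformative": [0, 1, 0, 1, 0, 1],  # hypothetical: independent of the class
}

# Rank the features from most to least informative.
ranking = sorted(
    ((name, information_gain(values, labels)) for name, values in features.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, gain in ranking:
    print(f"{name}: {gain:.3f} bits")
```

A feature whose values are independent of the class scores close to zero, which is the sense in which a ranking like this can replace an arbitrary choice of features.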
Keywords :
computer crime; e-mail filters; unsolicited e-mail; Ham corpora; feature selection; junk filters; phishing detection; spam detection; unsolicited bulk email; Equations; Feature extraction; HTML; IP networks; Suspensions; Unsolicited electronic mail;
Conference_Title :
eCrime Researchers Summit (eCrime), 2010
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4244-7760-9
DOI :
10.1109/ecrime.2010.5706696