DocumentCode :
3417266
Title :
Enhanced content analysis of fraudulent Nigeria electronic mails using e-STAT
Author :
Longe, O.B. ; Abayomi-Alli, A. ; Shaib, I. I O ; Longe, F.A.
Author_Institution :
Int. Centre For IT & Dev., Southern Univ., Baton Rouge, LA, USA
fYear :
2009
fDate :
14-16 Jan. 2009
Firstpage :
238
Lastpage :
243
Abstract :
A large percentage of fraudulent spam mails are believed to originate from Nigeria or from Nigerians in remote locations. These mails (popularly referred to as 419 spam) come in broad categories, but all are intended to defraud their recipients. Testing the validity of sender and receiver addresses is one method that has been used to filter spam mails. This approach will not filter out ordinary e-mails, since typical e-mail users will always include their true e-mail addresses to facilitate replies. Checking the IP addresses of 419 mails is a way of ascertaining their actual origin. This can be done to build a database of e-mail abuse or to blacklist addresses from which fraudulent mails originate, keeping in mind that blacklisted IP addresses could be used to stop the delivery of further mails from such addresses in the future. To this end, this research examines features selected specifically from the content analysis of Nigerian spam e-mail. A domain-specific statistical content analysis tool (e-STAT) was developed and implemented using a Bayesian statistical technique. The software was trained and tested with a sizeable balanced corpus of Nigerian 419 spam e-mails and normal (ham) e-mails. Analysis of classified mails using e-STAT showed current concept drift patterns among Nigerian 419 spammers and provided a blacklist of about 2,173 e-mail sender addresses, 563 URLs found within spam mails, and a total of 13,491 bag-of-words terms common to Nigerian spam e-mails. The results will guide future research in the domain of 419 mails in designing effective spam filters and electronic mail classifiers.
Keywords :
Bayes methods; information filtering; statistical analysis; unsolicited e-mail; Bayesian statistical technique; e-STAT; e-mail; fraudulent Nigeria electronic mails; fraudulent spam mails; statistical content analysis tool; Bayesian methods; Educational institutions; Electronic mail; Filters; Pattern analysis; Postal services; Software testing; Spatial databases; Statistics; Unsolicited electronic mail; 419; Blacklisting; Classifiers; Filtering; IP address; Nigeria; Spam; Spammers;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Adaptive Science & Technology, 2009. ICAST 2009. 2nd International Conference on
Conference_Location :
Accra
ISSN :
0855-8906
Print_ISBN :
978-1-4244-3522-7
Electronic_ISBN :
0855-8906
Type :
conf
DOI :
10.1109/ICASTECH.2009.5409717
Filename :
5409717