Title :
Data mining with machine learning applied for email deception
Author :
More, Sagar ; Kulkarni, Sandhya A.
Author_Institution :
Dept. of Comput. Sci. & Eng., Visvesvaraya Technol. Univ., Belgaum, India
Abstract :
Spam, also known as junk mail or Unsolicited Commercial Email (UCE), has become a major problem for the sustainability of the Internet and global commerce. Every day, millions of spam emails are sent over the Internet to targeted populations to advertise services, products, and dangerous software. A number of content-based spam detection algorithms have been proposed to classify emails, but they have not achieved sufficient accuracy. Our proposed work focuses mainly on cognitive (spam) words for classification. These features are sequential unique and closed patterns extracted from the message content. We show that these features have a strong impact on separating spam from legitimate messages. Our method, which can be easily implemented, compares favorably with popular algorithms such as Logistic Regression, Neural Network, Naive Bayes, and Random Forest, using a polynomial kernel as a filter. It achieves higher accuracy than related methods. In addition, our method is resilient against irrelevant and bothersome words.
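As a rough illustration of the ideas named in the abstract (word-based features taken from message content, and a polynomial kernel), the following minimal sketch counts occurrences of a few spam-indicative words and compares two messages with a polynomial kernel. The word list, the function names, and the kernel parameters are hypothetical and chosen only for illustration; the abstract does not specify the authors' actual word set, pattern-mining procedure, or settings.

```python
import re

# Hypothetical spam-word list; the paper's actual "cognitive words" are not given.
SPAM_WORDS = ["free", "winner", "offer", "click"]

def word_features(message, vocabulary=SPAM_WORDS):
    """Count how often each vocabulary word occurs in the message."""
    tokens = re.findall(r"[a-z']+", message.lower())
    return [tokens.count(word) for word in vocabulary]

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel K(x, y) = (x . y + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree

# Two made-up example messages (assumptions, not data from the paper).
spam = word_features("Click now for a FREE offer, free gift inside")
ham = word_features("Meeting moved to Friday, see agenda attached")

print(spam, ham)
print(polynomial_kernel(spam, spam), polynomial_kernel(spam, ham))
```

A kernel value of this kind could be passed to a kernel-based classifier; which classifier and which degree the authors used is not stated in the abstract.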
Keywords :
Internet; data mining; learning (artificial intelligence); unsolicited e-mail; UCE; email deception; global commerce; junk mail; logistic regression; machine learning; message content; naive Bayes; neural network; polynomial kernel; random forest; spam detection algorithms; spam mails; unsolicited commercial email; Classification; Cognitive Words; Spam Detection; Weka
Conference_Title :
Optical Imaging Sensor and Security (ICOSS), 2013 International Conference on
Conference_Location :
Coimbatore
Print_ISBN :
978-1-4799-0935-3
DOI :
10.1109/ICOISS.2013.6678403