Title :
Identifying Image Spam based on Header and File Properties using C4.5 Decision Trees and Support Vector Machine Learning
Author :
Krasser, Sven ; Tang, Yuchun C. ; Gould, Jeremy ; Alperovitch, Dmitri ; Judge, Paul
Author_Institution :
Secure Computing Corp., Alpharetta, GA
Abstract :
Image spam poses a serious threat to email communications because of its high volume, greater bandwidth requirements, and higher processing requirements for filtering. We present a feature extraction and classification framework that operates on features that can be extracted from image files very quickly. We analyze the information gain of each feature considered and report classification performance for C4.5 decision tree and support vector machine classifiers. Lastly, we compare the performance achievable with these fast features to that of a more complex image classifier operating on morphological features extracted from fully decoded images. The proposed classifier detects a large number of malicious images while remaining computationally inexpensive.
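To illustrate the kind of processing the abstract describes, the following is a minimal Python sketch, not the authors' implementation. The abstract does not list the features used, so the ones read here (format, width, height, file size, aspect ratio) are assumptions, as are the function names `header_features`, `entropy`, and `information_gain`. The sketch reads image dimensions from GIF and PNG headers without decoding any pixel data and computes the standard entropy-based information gain of a discrete feature.

```python
import math
import struct
from collections import Counter


def header_features(data: bytes) -> dict:
    """Read cheap header and file properties without decoding pixels.

    The feature set is an illustrative assumption, not the paper's list.
    """
    fmt, width, height = "unknown", 0, 0
    if data[:6] in (b"GIF87a", b"GIF89a") and len(data) >= 10:
        # GIF logical screen descriptor: little-endian width and height at offset 6.
        width, height = struct.unpack("<HH", data[6:10])
        fmt = "gif"
    elif data[:8] == b"\x89PNG\r\n\x1a\n" and len(data) >= 24:
        # PNG IHDR chunk: big-endian width and height at offset 16.
        width, height = struct.unpack(">II", data[16:24])
        fmt = "png"
    return {
        "format": fmt,
        "width": width,
        "height": height,
        "file_size": len(data),
        "aspect_ratio": width / height if height else 0.0,
    }


def entropy(labels) -> float:
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels) -> float:
    """Entropy reduction obtained by splitting the labels on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder
```

In a full pipeline, feature vectors of this kind would be passed to a decision tree or support vector machine learner; continuous features such as file size would need to be discretized before `information_gain` is applied to them.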
Keywords :
decision trees; feature extraction; image classification; support vector machines; unsolicited e-mail; C4.5 decision trees; decoded images; email communications; file properties; filtering; header properties; image classifier; image spam; malicious images; morphological features; support vector machine learning; Bandwidth; Classification tree analysis; Data mining; Decision trees; Electronic mail; Feature extraction; Machine learning; Support vector machine classification; Support vector machines; Unsolicited electronic mail
Conference_Title :
Information Assurance and Security Workshop, 2007. IAW '07. IEEE SMC
Conference_Location :
West Point, NY
Print_ISBN :
1-4244-1304-4
Electronic_ISBN :
1-4244-1304-4
DOI :
10.1109/IAW.2007.381941