DocumentCode
3314445
Title
A MapReduce based parallel SVM for large scale spam filtering
Author
Caruana, G.; Maozhen Li; Man Qi
Author_Institution
Sch. of Eng. & Design, Brunel Univ., Uxbridge, UK
Volume
4
fYear
2011
fDate
26-28 July 2011
Firstpage
2659
Lastpage
2662
Abstract
Spam continues to inflict increased damage. Varying approaches including Support Vector Machine (SVM) based techniques have been proposed for spam classification. However, SVM training is a computationally intensive process. This paper presents a parallel SVM algorithm for scalable spam filtering. By distributing, processing and optimizing the subsets of the training data across multiple participating nodes, the distributed SVM reduces the training time significantly. Ontology based concepts are also employed to minimize the impact of accuracy degradation when distributing the training data amongst the SVM classifiers.
Keywords
ontologies (artificial intelligence); parallel algorithms; pattern classification; support vector machines; unsolicited e-mail; MapReduce; SVM classifiers; large scale spam filtering; ontology; parallel SVM algorithm; spam classification; Accuracy; Filtering; Machine learning; Ontologies; Support vector machines; Training; Training data; Classification; Machine Learning; Ontology Semantics; Parallel Computing; Support Vector Machine;
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
Conference_Location
Shanghai
Print_ISBN
978-1-61284-180-9
Type
conf
DOI
10.1109/FSKD.2011.6020074
Filename
6020074