Title :
Recurring concept detection for spam filtering
Author :
Abad, Miguel Angel ; Gomes, Joao Bartolo ; Menasalvas, Ernestina
Author_Institution :
Univ. Politec. de Madrid, Madrid, Spain
Abstract :
In this work we examine the problem of recurring concept drifts and propose a framework to manage them. Its implementation and evaluation are oriented towards spam detection, a real-world setting in which concepts (spam patterns) may reappear. Detecting recurring drifts makes it possible to reuse previously learnt models, improving the overall learning process in terms of accuracy and efficiency. To this end, we propose the Meta-Model Drift Detector (MM-DD). The proposed system deals with the underlying context associated with the drifts detected throughout the data stream learning process. To do so, a meta-model is trained in parallel with the learning process. Whenever a drift occurs, the learning process of the base classifier feeds the meta-model with the available context information, so that the latter can predict recurrent situations in the near future. Therefore, when a drift is detected, the meta-model checks whether the context information is equal to any of the contexts previously handled by the learning process and provides the most suitable stored model to deal with the concept. Our experimental results support the value of the proposed MM-DD in terms of accuracy when compared with existing approaches.
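The reuse step described in the abstract can be illustrated with a minimal sketch. This is not the authors' MM-DD implementation: the trained meta-model is replaced here by an exact-match lookup on the context, and every name (`ModelRepository`, `on_drift`, `train_new`, the context tuples) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Hashable, Optional, Tuple

# Assumption: a context is a tuple of hashable feature values.
Context = Tuple[Hashable, ...]


@dataclass
class ModelRepository:
    """Hypothetical store mapping a context to the model learnt under it."""
    models: Dict[Context, Any] = field(default_factory=dict)

    def store(self, context: Context, model: Any) -> None:
        self.models[context] = model

    def lookup(self, context: Context) -> Optional[Any]:
        # Exact-match stand-in for the meta-model's prediction.
        return self.models.get(context)


def on_drift(
    repo: ModelRepository,
    old_context: Context,
    old_model: Any,
    new_context: Context,
    train_new: Callable[[], Any],
) -> Tuple[Any, bool]:
    """Handle a detected drift.

    Saves the outgoing model under its context, then reuses a stored
    model if the new context was seen before; otherwise trains a new one.
    Returns (model, reused_flag).
    """
    repo.store(old_context, old_model)
    stored = repo.lookup(new_context)
    if stored is not None:
        return stored, True
    return train_new(), False
```

In this sketch, a model learnt under one context is returned again when a later drift leads back to the same context, which is the reuse behaviour the abstract attributes to recurring concepts. A learnt meta-model would replace `lookup` with a prediction rather than an equality test.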
Keywords :
classification; information filtering; learning (artificial intelligence); unsolicited e-mail; MM-DD; base classifier; concept detection; learning process; meta-model drift detector; recurring concept drifts; spam detection; spam filtering; Accuracy; Adaptation models; Context; Context modeling; Prediction algorithms; Training; Unsolicited electronic mail;
Conference_Title :
Information Fusion (FUSION), 2014 17th International Conference on
Conference_Location :
Salamanca