Title :
Chinese spam filtering based on online active learning methods
Author :
Sun, Guanglu ; Ma, Yingcai ; Shen, Yuewu ; Guo, Feng
Author_Institution :
School of Computing Science and Technology, Harbin University of Science and Technology, Harbin, China
Abstract :
This paper proposes new active learning methods for filtering Chinese spam. Labeling the spam emails in large datasets is time-consuming and expensive. Active learning methods can substantially reduce the labeling cost by identifying informative examples, and they speed up an online Logistic Regression filter. Experiments show that the proposed methods not only decrease the number of label requests but also improve the classification performance of spam filtering.
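The abstract does not state which selection criterion the proposed methods use. As an illustration only, the following minimal sketch assumes a common strategy, uncertainty sampling: an online logistic regression filter scores each incoming message and requests a label only when the predicted spam probability falls close to 0.5. The class name `OnlineActiveLR`, the helper `char_bigrams`, the `margin` and `learning_rate` parameters, and the choice of character bigrams as features are all hypothetical and are not taken from the paper.

```python
import math


def char_bigrams(text):
    """Split text into overlapping character bigrams (assumed feature scheme;
    it avoids the need for Chinese word segmentation)."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


class OnlineActiveLR:
    """Online logistic regression that asks for a label only when uncertain.

    Illustrative sketch of uncertainty sampling, not the paper's method.
    """

    def __init__(self, learning_rate=0.5, margin=0.2):
        self.weights = {}                   # sparse weights keyed by feature
        self.learning_rate = learning_rate  # hypothetical step size
        self.margin = margin                # hypothetical band around p = 0.5
        self.label_requests = 0

    def predict_proba(self, features):
        """Return the estimated probability that the message is spam."""
        score = sum(self.weights.get(f, 0.0) for f in features)
        score = max(-30.0, min(30.0, score))  # keep exp() in a safe range
        return 1.0 / (1.0 + math.exp(-score))

    def process(self, features, oracle):
        """Classify one message; query the oracle and update if uncertain.

        `oracle` is a zero-argument callable returning 1 (spam) or 0 (ham).
        It is called only when a label is actually requested.
        """
        p = self.predict_proba(features)
        if abs(p - 0.5) <= self.margin:
            y = oracle()
            self.label_requests += 1
            # One gradient step on the logistic loss for this example.
            for f in features:
                self.weights[f] = (self.weights.get(f, 0.0)
                                   + self.learning_rate * (y - p))
        return p
```

In this sketch, messages the filter already classifies confidently are passed through without a label request, which is how the labeling cost is reduced; a wider `margin` trades more label requests for more frequent updates.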
Keywords :
information filtering; learning (artificial intelligence); regression analysis; unsolicited e-mail; Chinese spam filtering; large datasets; logistic regression filter; online active learning methods; spam emails; Educational institutions; Electronic mail; Filtering; Learning systems; Logistics; Machine learning; Training; Active learning; Logistic Regression
Conference_Title :
Strategic Technology (IFOST), 2012 7th International Forum on
Conference_Location :
Tomsk
Print_ISBN :
978-1-4673-1772-6
DOI :
10.1109/IFOST.2012.6357637