Author_Institution :
College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China
Abstract :
Social networks have become a very popular way for Internet users to communicate and interact online. Users spend a great deal of time on well-known social networks (e.g., Facebook, Twitter, Sina Weibo), reading news, discussing events, and posting messages. Unfortunately, this popularity also attracts a significant number of spammers who continuously exhibit malicious behavior (e.g., posting messages containing commercial topics or URLs, or following a large number of users), causing great inconvenience to normal users' social activities. In this paper, a spammer filtering method based on supervised machine learning is proposed. We first collect a dataset from Sina Weibo that includes 30,116 users and more than 16 million messages. We then construct a labeled dataset by manually classifying users as spammers or non-spammers. Finally, we extract a set of novel features from message content and users' social behavior and apply them in an SVM-based spammer classifier. Our experiments show that the true positive rates for spammers and non-spammers reach 99.1% and 99.9%, respectively.
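The pipeline described in the abstract (per-user features from message content and social behavior, followed by a per-class true positive rate) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `User` record, the two features (`url_ratio`, `following_follower_ratio`), and the helper names are hypothetical, chosen only because the abstract mentions URL-bearing messages and heavy following as spammer behaviors; the paper's actual feature set is not given here. The SVM step itself is omitted. In practice the feature vectors would be passed to an SVM implementation such as scikit-learn's `sklearn.svm.SVC`.

```python
import re
from dataclasses import dataclass
from typing import Dict, List

# Simple URL matcher; an assumption for illustration, not the paper's rule.
URL_RE = re.compile(r"https?://\S+")


@dataclass
class User:
    """Hypothetical per-user record: posted messages plus social counts."""
    messages: List[str]
    followers: int
    following: int


def extract_features(user: User) -> Dict[str, float]:
    """Compute two illustrative features for one user."""
    n = len(user.messages)
    with_url = sum(1 for m in user.messages if URL_RE.search(m))
    return {
        # Content feature: fraction of messages that contain a URL.
        "url_ratio": with_url / n if n else 0.0,
        # Behavior feature: accounts followed per follower
        # (denominator clamped to 1 to avoid division by zero).
        "following_follower_ratio": user.following / max(user.followers, 1),
    }


def per_class_tpr(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """True positive rate per class: correctly predicted / actual members."""
    rates = {}
    for label in set(y_true):
        total = sum(1 for t in y_true if t == label)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        rates[label] = hits / total
    return rates
```

Reporting the rate separately for each class, as the abstract does, matters when spammers are a small minority: a single overall accuracy figure could look high even if most spammers were missed.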
Keywords :
Internet; learning (artificial intelligence); pattern classification; social networking (online); support vector machines; unsolicited e-mail; Internet users; SVM based spammer classifier; Weibo social network; event discussion; malicious behaviors; message content; message posting; news reading; normal user social activities; spammer detection; supervised machine learning based spammer filtering method; user labeled dataset; user social behavior; Classification algorithms; Crawlers; Feature extraction; Support vector machines; Twitter; Unsolicited electronic mail; machine learning; social network; spam; spammer; support vector machine;