Author_Institution :
Dept. of Comput. Sci., Coll. of New Jersey, Ewing, NJ, USA
Abstract :
Online Social Networks (OSNs) such as Facebook and Twitter are among the fastest-growing online entities. Because they require little authentication to create an account, they are susceptible to social spam attacks. Social spam consists of low-quality, unsolicited bulk messages that commonly originate from Social Spam Profiles (SSPs). Spam messages may contain malicious links that infect users and propagate throughout OSNs. Twitter, a fast-growing OSN, has a large number of SSPs with the potential to harm legitimate users. In this paper, a fast and scalable approach is proposed to detect SSPs on Twitter using content, behavioral, and graph-based data. After various investigations, a threshold- and associative-based classifier is created. The new classifier is then compared with the supervised machine learning algorithm SVM and with two other existing algorithms in terms of accuracy, precision, sensitivity, and specificity. The new classifier achieves an accuracy of 79.26%, compared with 69.32% for SVM. In summary, SSPs tend to be younger accounts that post more statuses, send more tweets in succession, and use keywords that differentiate a spam profile from a non-spam profile.
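The abstract compares classifiers on accuracy, precision, sensitivity, and specificity without spelling out how these are computed. The sketch below shows the standard definitions from a binary confusion matrix, treating "spam profile" as the positive class. The names `ConfusionCounts`, `count_outcomes`, and `evaluate` are hypothetical helpers introduced here for illustration; they are not taken from the paper, and no figures from the paper are reproduced.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts for a binary spam / non-spam classifier (hypothetical helper)."""
    tp: int  # spam profiles correctly flagged as spam
    fp: int  # legitimate profiles wrongly flagged as spam
    tn: int  # legitimate profiles correctly left unflagged
    fn: int  # spam profiles that were missed


def count_outcomes(actual, predicted):
    """Tally outcomes from two equal-length sequences of booleans (True = spam)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a and p:
            tp += 1
        elif not a and p:
            fp += 1
        elif not a and not p:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp=tp, fp=fp, tn=tn, fn=fn)


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def evaluate(c):
    """Compute the four measures named in the abstract from outcome counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),      # share of all profiles classified correctly
        "precision": _ratio(c.tp, c.tp + c.fp),      # share of flagged profiles that are spam
        "sensitivity": _ratio(c.tp, c.tp + c.fn),    # share of spam profiles that are caught
        "specificity": _ratio(c.tn, c.tn + c.fp),    # share of legitimate profiles left alone
    }


# Made-up labels for illustration only (True = spam profile).
example_actual = [True, True, False, False, True]
example_predicted = [True, False, False, True, True]
example_metrics = evaluate(count_outcomes(example_actual, example_predicted))
```

Reporting all four measures matters for spam detection because the classes are typically imbalanced: a classifier can reach high accuracy while still missing most spam profiles (low sensitivity) or flagging many legitimate users (low specificity).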
Keywords :
learning (artificial intelligence); pattern classification; security of data; social networking (online); support vector machines; unsolicited e-mail; Facebook; OSN; SSP; SVM; Twitter; associative based classification; behavioral data; content data; graph-based data; online social networks; social spam attacks; social spam profile detection; spam messages; supervised machine learning algorithm; threshold based classification; Accuracy; Data collection; Media; Unsolicited electronic mail; Behavioral-Based; Content-Based; Graph-Based; Online Social Network (OSN); Social Spam Profile (SSP)