Title :
Contextual feature based one-class classifier approach for detecting video response spam on YouTube
Author :
Chaudhary, Varun ; Sureka, A.
Author_Institution :
Indraprastha Inst. of Inf. Technol. (IIITD), New Delhi, India
Abstract :
YouTube is one of the largest video sharing websites (with social networking features) on the Internet. The immense popularity of YouTube, together with its anonymity and low publication barrier, has resulted in several forms of misuse and video pollution, such as the uploading of malicious, copyright-violating and spam videos or content. YouTube has a popular and commonly used feature called video response, which allows users to post a video in response to an uploaded or existing video. Some of the popular videos on YouTube receive thousands of video responses. We observe the presence of opportunistic users posting unrelated, promotional or pornographic videos (spam videos posted manually or using automated scripts) as video responses to existing videos. We present a method of mining YouTube to automatically detect video response spam. We formulate video response spam detection as a one-class classification problem (a recognition task) and divide it into three sub-problems: promotional video recognition, pornographic or dirty video recognition, and automated script or botnet uploader recognition. We create a sample dataset of target class videos for each of the three sub-problems and identify contextual features (meta-data based or non-content based features) characterizing the target class. Our empirical analysis reveals that certain linguistic features (presence of certain terms in the title or description of the YouTube video), temporal features, popularity-based features and time-based features can be used to predict the video type. We identify features with discriminatory power and use them within a one-class classification framework to recognize video response spam. We conduct a series of experiments to validate the proposed approach and present evidence demonstrating its effectiveness, with more than 80% accuracy.
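The one-class formulation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the term list `PROMO_TERMS`, the metadata keys (`title`, `description`, `views`, `response_delay_s`), and the centroid-distance classifier are all assumptions chosen for illustration. The sketch trains only on target class (promotional) videos and accepts a new video if its contextual feature vector lies close to the training centroid.

```python
import math
import re

# Hypothetical promotional vocabulary (an assumption, not taken from the paper).
PROMO_TERMS = {"free", "win", "click", "subscribe", "discount"}


def extract_features(video):
    """Map a video's metadata dict to a contextual (non-content) feature vector."""
    text = (video["title"] + " " + video["description"]).lower()
    tokens = re.findall(r"[a-z0-9']+", text)
    # Linguistic feature: share of tokens that are promotional terms.
    term_ratio = sum(t in PROMO_TERMS for t in tokens) / max(len(tokens), 1)
    # Popularity feature: log-scaled view count.
    log_views = math.log1p(video["views"])
    # Temporal feature: log-scaled delay (seconds) before the response was posted.
    log_delay = math.log1p(video["response_delay_s"])
    return [term_ratio, log_views, log_delay]


class CentroidOneClass:
    """Toy one-class classifier: accept points near the target-class centroid."""

    def fit(self, vectors, quantile=0.9):
        n, dims = len(vectors), len(vectors[0])
        self.mean = [sum(v[d] for v in vectors) / n for d in range(dims)]
        # Per-feature standard deviation; fall back to 1.0 if a feature is constant.
        self.scale = [
            math.sqrt(sum((v[d] - self.mean[d]) ** 2 for v in vectors) / n) or 1.0
            for d in range(dims)
        ]
        # Threshold: distance at the given quantile of the training samples.
        dists = sorted(self._distance(v) for v in vectors)
        self.threshold = dists[min(n - 1, int(quantile * n))]
        return self

    def _distance(self, v):
        # Euclidean distance in standardized feature space.
        return math.sqrt(
            sum(((x - m) / s) ** 2 for x, m, s in zip(v, self.mean, self.scale))
        )

    def predict(self, v):
        """Return True if v is recognized as a member of the target class."""
        return self._distance(v) <= self.threshold


# Toy target-class training set (fabricated for illustration only).
TRAIN = [
    {"title": "Win free iPhone click here", "description": "subscribe for discount",
     "views": 120, "response_delay_s": 30},
    {"title": "Free discount click now", "description": "win win",
     "views": 80, "response_delay_s": 45},
    {"title": "Click to win free gift", "description": "subscribe free",
     "views": 200, "response_delay_s": 20},
    {"title": "Free free free", "description": "click subscribe",
     "views": 150, "response_delay_s": 60},
]

model = CentroidOneClass().fit([extract_features(v) for v in TRAIN])
```

Under this setup, one model would be trained per sub-problem (promotional, pornographic, botnet uploader), each on its own target class sample; a response is flagged as spam if any model recognizes it. A standard one-class learner could be substituted for the centroid rule without changing the overall structure.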
Keywords :
Internet; image classification; image recognition; image sensors; social networking (online); unsolicited e-mail; YouTube; automated script; botnet uploader recognition; contextual feature; copyright violation; dirty video recognition; metadata based feature; noncontent based feature; one-class classifier approach; pornographic video; promotional video recognition; social networking; video pollution; video response spam detection; video sharing website; Feature extraction; Testing; Training; Unsolicited electronic mail; Vectors
Conference_Title :
Privacy, Security and Trust (PST), 2013 Eleventh Annual International Conference on
Conference_Location :
Tarragona
DOI :
10.1109/PST.2013.6596054