DocumentCode :
651661
Title :
A content-context-centric approach for detecting vandalism in Wikipedia
Author :
Ramaswamy, Lakshmish ; Tummalapenta, Raga Sowmya ; Li, Kang ; Pu, Calton
Author_Institution :
Dept. of Comput. Sci., Univ. of Georgia, Athens, GA, USA
fYear :
2013
fDate :
20-23 Oct. 2013
Firstpage :
115
Lastpage :
122
Abstract :
Collaborative online social media (CSM) applications such as Wikipedia have not only revolutionized the World Wide Web but have also had a hugely positive effect on modern free societies. Unfortunately, Wikipedia has also become a target of a wide variety of vandalism attacks. Most existing vandalism detection techniques rely on simple textual features such as the presence of abusive language or spammy words. These techniques are ineffective against sophisticated vandal edits, which often do not contain the tell-tale markers associated with vandalism. In this paper, we argue for a context-aware approach to vandalism detection and propose a content-context-aware vandalism detection framework. The main idea is to quantify how well the words contained in the edit fit into the topic and the existing content of the Wikipedia article. We present two novel metrics for this purpose, called WWW co-occurrence probability and top-ranked co-occurrence probability. We also develop efficient mechanisms for evaluating these two metrics, and machine learning-based schemes that utilize them. The paper presents a range of experiments to demonstrate the effectiveness of the proposed approach.
Keywords :
groupware; learning (artificial intelligence); security of data; social networking (online); CSM; WWW cooccurrence probability; Wikipedia article; abusive language; collaborative online social media applications; content-context-centric approach; machine learning-based schemes; modern free societies; sophisticated vandal edits; spammy words; tell-tale markers; textual features; top-ranked cooccurrence probability; vandalism attacks; vandalism detection techniques; Context; Electronic publishing; Encyclopedias; Internet; Measurement; World Wide Web; Collaborative online social media; WWW co-occurrence probability; content-context; top-ranked co-occurrence probability; vandalism detection;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Collaborative Computing: Networking, Applications and Worksharing (Collaboratecom), 2013 9th International Conference on
Conference_Location :
Austin, TX
Type :
conf
Filename :
6679976