Title :
Wiki-Watchdog: Anomaly Detection in Wikipedia Through a Distributional Lens
Author :
Arackaparambil, Chrisil ; Yan, Guanhua
Abstract :
Wikipedia has become a standard source of reference online, and many people (some unknowingly) now trust this corpus of knowledge as an authority to fulfil their information requirements. In doing so they task the human contributors of Wikipedia with maintaining the accuracy of articles, a job that these contributors have been performing admirably. We study the problem of monitoring the Wikipedia corpus with the goal of automated, online anomaly detection. We present Wiki-Watchdog, an efficient distribution-based methodology that monitors distributions of revision activity for changes. We show that using our methods it is possible to detect the activity of bots, flash events, and outages as they occur. Our methods are proposed to support the monitoring of the contributors. They are useful for speeding up anomaly detection and for identifying events that are hard to detect manually. We show the efficacy and the low false-positive rate of our methods by experiments on the revision history of Wikipedia. Our results show that distribution-based anomaly detection has a higher detection rate than traditional methods based on either volume or entropy alone. Unlike previous work on anomaly detection in information networks, which worked with a static network graph, our methods consider the network as it evolves and monitor properties of the network for changes. Although our methodology is developed and evaluated on Wikipedia, we believe it is an effective generic anomaly detection framework in its own right.
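The claim that comparing distributions can catch changes that volume or entropy alone miss can be illustrated with a small sketch. This is not the paper's implementation: the abstract does not name a distance measure, so Jensen-Shannon divergence is used here as an assumption, and the function names, the threshold value, and the toy editor names are all hypothetical.

```python
import math
from collections import Counter

def distribution(events):
    """Turn a list of event labels (e.g. editor names) into probabilities."""
    counts = Counter(events)
    total = sum(counts.values())
    return {k: c / total for k, c in counts.items()}

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (assumed measure, not from the paper)."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # KL divergence of a from the mixture m; m[k] > 0 wherever a[k] > 0.
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)

def is_anomalous(prev_events, curr_events, threshold=0.1):
    """Flag a window whose distribution moved far from the previous one.

    The threshold is an arbitrary illustrative value.
    """
    p, q = distribution(prev_events), distribution(curr_events)
    return js_divergence(p, q) > threshold

# Two windows of revision activity with identical volume and identical
# entropy, but with the activity redistributed across editors.
baseline = ["alice"] * 6 + ["bob"] * 3 + ["carol"] * 1
shifted = ["alice"] * 1 + ["bob"] * 3 + ["carol"] * 6

same_volume = len(baseline) == len(shifted)
same_entropy = math.isclose(entropy(distribution(baseline)),
                            entropy(distribution(shifted)))
flagged = is_anomalous(baseline, shifted)
```

In this toy case a volume monitor and an entropy monitor both see no change between the two windows, while the distributional comparison flags the shift, because entropy is unchanged when the same probabilities are reassigned to different editors.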
Keywords :
Web sites; network theory (graphs); security of data; Wiki-watchdog; Wikipedia corpus; contributor monitoring; distribution-based methodology; distributional lens; entropy; information networks; information requirements; online anomaly detection; static network graph; Electronic publishing; Encyclopedias; Entropy; Internet; Measurement; Monitoring;
Conference_Title :
Web Intelligence and Intelligent Agent Technology (WI-IAT), 2011 IEEE/WIC/ACM International Conference on
Conference_Location :
Lyon
Print_ISBN :
978-1-4577-1373-6
Electronic_ISBN :
978-0-7695-4513-4
DOI :
10.1109/WI-IAT.2011.86