DocumentCode :
480692
Title :
Splog Filtering Based on Writing Consistency
Author :
Liu, Wei ; Tan, Songbo ; Xu, Hongbo ; Wang, Lihong
Author_Institution :
Inst. of Comput. Technol., Chinese Acad. of Sci., Beijing
Volume :
1
fYear :
2008
fDate :
9-12 Dec. 2008
Firstpage :
227
Lastpage :
233
Abstract :
Splogs (spam blogs) are the key challenge in accessing the blogosphere. Existing splog-filtering methods are limited to techniques from traditional web spam filtering and do not consider the characteristics of blogs. Inspired by the observation that fake writers (writers of splogs) show strikingly more consistent writing behavior than real writers (writers of legitimate blogs), we propose to detect splogs by distinguishing fake writers from real writers. To measure how consistent the writing behavior is, we propose consistency-based features derived from writing interval, writing structure, and writing topic. We then design a splog-filtering system that can use the consistency-based features effectively and flexibly. Experimental results on the Blog06 data set show that the proposed measure can effectively detect splogs, reaching an accuracy of 90%. Compared with content-based methods, our approach achieves comparable accuracy with fewer features and a smaller training set, indicating that writing consistency represents the essential difference between splogs and blogs.
Keywords :
Web sites; behavioural sciences; security of data; unsolicited e-mail; Blog06 data set; Web spam filtering; fake writers; splog-filtering methods; writing behavior; writing consistency; Blogs; Computer network management; Information analysis; Information filtering; Information filters; Information security; Intelligent agent; Search engines; Web pages; Writing; splog filtering
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-0-7695-3496-1
Type :
conf
DOI :
10.1109/WIIAT.2008.21
Filename :
4740454