Title :
High-Order Concept Associations Mining and Inferential Language Modeling for Online Review Spam Detection
Author :
Lai, C.L. ; Xu, K.Q. ; Lau, Raymond Y K ; Li, Yuefeng ; Song, Dawei
Author_Institution :
Dept. of Inf. Syst., City Univ. of Hong Kong, Kowloon, China
Abstract :
Although many incidents of fake online consumer reviews have been reported, very few studies to date have examined the trustworthiness of online consumer reviews. One reason is the lack of an effective computational method for separating untruthful reviews (i.e., spam) from legitimate ones (i.e., ham), given that prominent spam features are often missing in online reviews. The main contribution of our research is the development of a novel review spam detection method underpinned by an unsupervised inferential language modeling framework. Another contribution is the development of a high-order concept association mining method, which provides the essential term association knowledge to bootstrap the performance of untruthful review detection. Our experimental results confirm that the proposed inferential language model, equipped with high-order concept association knowledge, is effective in untruthful review detection when compared with other baseline methods.
Keywords :
Internet; data mining; security of data; text analysis; unsolicited e-mail; unsupervised learning; association knowledge; baseline method; fake online consumer review; high order concept associations mining; spam detection; spam features; text mining; unsupervised inferential language modeling; Kullback-Leibler Divergence; Language Modeling; Review Spam; Spam Detection; Text Mining
Conference_Title :
Data Mining Workshops (ICDMW), 2010 IEEE International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-1-4244-9244-2
Electronic_ISBN :
978-0-7695-4257-7
DOI :
10.1109/ICDMW.2010.30