DocumentCode :
3580555
Title :
Opinion Spam Detection Using Feature Selection
Author :
Patel, Rinki ; Thakkar, Priyank
Author_Institution :
Dept. of Comput. Sci. & Eng., Nirma Univ., Ahmedabad, India
fYear :
2014
Firstpage :
560
Lastpage :
564
Abstract :
It has become essential for e-commerce businesses to let their customers write reviews of the products or services they have used. Such reviews are a vital source of information about these products or services, and prospective customers consult them before deciding on a purchase. Marketers also use these opinions or reviews to find the drawbacks of their own products or services and to gather information about their competitors' products or services, which in turn allows them to identify the weaknesses or strengths of products. Unfortunately, the usefulness of opinions has also given rise to the problem of opinion spam, which contains forged positive or spiteful negative opinions. This paper focuses on the detection of deceptive opinion spam. A recently proposed opinion spam detection method based on n-gram techniques is extended by means of feature selection and different representations of the opinions. The problem is modelled as a classification problem, and a Naïve Bayes (NB) classifier and a Least Squares Support Vector Machine (LS-SVM) are applied to three different representations of the opinions: Boolean, bag-of-words, and term frequency-inverse document frequency (TF-IDF). All experiments are carried out on a widely used gold-standard dataset.
Keywords :
electronic commerce; feature selection; least squares approximations; pattern classification; support vector machines; text analysis; unsolicited e-mail; Boolean representation; LS-SVM; NB classifier; TF-IDF; bag-of-word representation; deceptive opinion spam; e-commerce business; feature selection; forged positive opinions; gold-standard dataset; information sources; least squares support vector machine; n-gram techniques; naïve Bayes classifier; opinion representation; opinion spam detection; product purchasing; service purchasing; spiteful negative opinions; term frequency-inverse document frequency; Equations; Feature extraction; Mathematical model; Support vector machine classification; Training; Unsolicited electronic mail; Feature Selection; Opinion Spam Detection; Text Classification;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Intelligence and Communication Networks (CICN), 2014 International Conference on
Print_ISBN :
978-1-4799-6928-9
Type :
conf
DOI :
10.1109/CICN.2014.127
Filename :
7065547